Abstract. Validating the accuracy and long-term stability of terrestrial satellite data products requires a network of reference sites. This paper documents a database of more than 2000 sites worldwide that have been characterized in terms of their spatial heterogeneity. The work was motivated by the need for potential validation sites for geostationary surface albedo data products, but the resulting database is also useful for other applications. The database (SAVS 1.0) is publicly available through the EUMETSAT website (http://savs.eumetsat.int/, doi:10.15770/EUM_SEC_CLM_1001). Sites can be filtered according to different criteria, providing a flexible way to identify potential validation sites for further studies and a traceable approach to characterizing the heterogeneity of these reference sites. This paper describes in detail the generation of the SAVS 1.0 database and its characteristics.
This article describes the development of a machine learning (ML)-based algorithm for snowfall retrieval (Snow retrievaL ALgorithm fOr gpM–Cross Track, SLALOM-CT), exploiting ATMS radiometer measurements and using the CloudSat CPR snowfall products as references. During a preliminary analysis, different ML techniques (tree-based algorithms and shallow and convolutional neural networks, NNs) were intercompared. A large dataset (three years) of coincident observations from CPR and ATMS was used for training and testing the different techniques. The SLALOM-CT algorithm is based on four independent modules for the detection of snowfall and supercooled droplets, and for the estimation of snow water path and snowfall rate. Each module was designed by choosing the best-performing ML approach through model selection and optimization. While a convolutional NN was the most accurate for the snowfall detection module, a shallow NN was selected for all other modules. SLALOM-CT showed a high degree of consistency with CPR. Moreover, the results were almost independent of the background surface categorization and the observation angle. The reliability of the SLALOM-CT estimates was also highlighted by the good results obtained from a direct comparison with a reference algorithm (GPROF).
Averaging a set of individual measurements can reduce the stochastic error but can introduce a sampling error, particularly for irregularly sampled data. We present a general method to estimate the total error of an averaged quantity as a combination of the measurement error and the sampling error without knowledge of the true average value of the distribution. Our approach requires covariance matrices connecting the retrieved measurement values to an independent reference data set. These covariance matrices can be obtained from a representative validation data set. We confirm the validity of the method by estimating the temporal sampling error of monthly mean cloud fractional cover (CFC) data derived from the Spinning-Enhanced Visible and Infrared Imager radiometer onboard the METEOSAT Second Generation (MSG) spacecraft, operated by the European Organization for the Exploitation of Meteorological Satellites. The estimated sampling errors are then compared with the true sampling errors calculated from an hourly sampled complete data set. For this purpose, we use ten sampling scenarios. Some of them address typical sampling problems such as systematic over- and undersampling as well as hourly, daily, and random data gaps. Two additional sampling scenarios are directly related to the Satellite Application Facility on Climate Monitoring monthly mean CFC data record. These are used to estimate the worst-case sampling errors of this data record. The estimated total and sampling errors agree well with the corresponding calculated values. We derive the required covariance matrices from synoptic observations of cloud fraction, which are available across the MSG disk, the majority of them over European land surfaces. The method is not limited to temporal averaging of cloud fraction data.
Rather, it is a general method that is also applicable to temporal and spatial averaging of other parameters as long as appropriate covariance matrices are available.
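To illustrate how covariance matrices can turn a sampling pattern into an error estimate, the following is a minimal sketch and not the formulation used in the paper. It assumes a known covariance matrix of the true values over all time slots (`cov_truth`), a measurement-error covariance (`cov_meas`) that is independent of the truth, and a simple unweighted mean; the function name and all parameter names are hypothetical. Under these assumptions, the variance of the difference between the mean of the sampled measurements and the mean of the complete true series splits into a sampling term and a measurement term.

```python
import numpy as np


def averaging_error_variance(cov_truth, cov_meas, sampled):
    """Error variance of a sampled mean relative to the full true mean.

    cov_truth : (n, n) covariance of the true values over all n time slots
    cov_meas  : (n, n) covariance of the measurement errors (assumed
                independent of the true values)
    sampled   : length-n boolean mask, True where a measurement exists

    Returns (sampling_var, measurement_var, total_var).
    """
    cov_truth = np.asarray(cov_truth, dtype=float)
    cov_meas = np.asarray(cov_meas, dtype=float)
    sampled = np.asarray(sampled, dtype=bool)
    if not sampled.any():
        raise ValueError("at least one time slot must be sampled")

    n = sampled.size
    w_full = np.full(n, 1.0 / n)      # weights of the complete mean
    w_samp = sampled / sampled.sum()  # weights of the sampled mean

    # Sampling term: variance of (w_samp - w_full)^T x for true values x.
    d = w_samp - w_full
    sampling_var = float(d @ cov_truth @ d)

    # Measurement term: variance of w_samp^T e for measurement errors e.
    measurement_var = float(w_samp @ cov_meas @ w_samp)

    return sampling_var, measurement_var, sampling_var + measurement_var


# Illustrative values only: 24 hourly slots with exponentially decaying
# temporal correlation, sampled during 12 "daytime" hours.
hours = np.arange(24)
cov_truth = 0.09 * np.exp(-np.abs(hours[:, None] - hours[None, :]) / 6.0)
cov_meas = 0.01 * np.eye(24)
daytime = (hours >= 6) & (hours < 18)

sampling_var, measurement_var, total_var = averaging_error_variance(
    cov_truth, cov_meas, daytime
)
print(sampling_var, measurement_var, total_var)
```

With complete sampling the sampling term vanishes and only the averaged measurement error remains, which is the behaviour one would expect from the trade-off described in the abstract.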
The low accuracy of satellite cloud fraction (CF) data over the Arctic seriously restricts the accurate assessment of the regional and global radiative energy balance under a changing climate. Previous studies have reported that no individual satellite CF product could satisfy the needs of accuracy and spatiotemporal coverage simultaneously for long-term applications over the Arctic. Merging multiple CF products with complementary properties can provide an effective way to produce a spatiotemporally complete CF data record with higher accuracy. This study proposed a spatiotemporal statistical data fusion framework based on cumulative distribution function (CDF) matching and the Bayesian maximum entropy (BME) method to produce a synthetic 1° × 1° CF dataset in the Arctic during 2000–2020. The CDF matching was employed to remove the systematic biases among multiple passive sensor datasets through the constraint of using CF from an active sensor. The BME method was employed to combine adjusted satellite CF products to produce a spatiotemporally complete and accurate CF product. The advantages of the presented fusing framework are that it not only uses the spatiotemporal autocorrelations but also explicitly incorporates the uncertainties of passive sensor products benchmarked with reference data, i.e., active sensor product and ground-based observations. The inconsistencies of Arctic CF between passive sensor products and the reference data were reduced by about 10 %–20 % after fusing, with particularly noticeable improvements in the vicinity of Greenland.
Compared with ground-based observations, R² increased by about 0.20–0.48, and the root mean square error (RMSE) and bias reductions averaged about 6.09 % and 4.04 % for land regions, respectively; these metrics for ocean regions were about 0.05–0.31, 2.85 %, and 3.15 %, respectively. Compared with active sensor data, R² increased by nearly 0.16, and RMSE and bias declined by about 3.77 % and 4.31 %, respectively, over land; meanwhile, improvements in ocean regions were about 0.3 for R², 4.46 % for RMSE, and 3.92 % for bias. The results of the comparison with ERA5 and the Meteorological Research Institute – Atmospheric General Circulation model version 3.2S (MRI-AGCM3-2-S) climate model suggest an obvious improvement in the consistency between the satellite-observed CF and the reanalysis and model data after fusion. This serves as a promising indication that the fused CF results hold the potential to deliver reliable satellite observations for modeling and reanalysis data. Moreover, the fused product effectively fills the temporal gaps in Advanced Very High Resolution Radiometer (AVHRR)-based products caused by satellite faults and the data missing from MODIS-based products prior to the launch of Aqua, and it covers a longer period than the active sensor product; it also addresses the insufficient spatial coverage of the active sensor data and the AVHRR-based products at latitudes greater than 82.5° N. A continuous monthly 1° CF product covering the entire Arctic during 2000–2020 was generated and is freely available to the public at https://doi.org/10.5281/zenodo.7624605 (Liu and He, 2022). This is of great importance for reducing the uncertainty in the estimation of surface radiation parameters and thus helps researchers to better understand the Earth's energy imbalance.
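The CDF-matching step mentioned in this abstract can be sketched as empirical quantile mapping. This is an illustrative sketch only, not the authors' implementation: the function name `cdf_match`, the number of quantiles, and the synthetic data are all assumptions. The idea is to map each value of a biased product onto the value of the reference product that has the same cumulative probability.

```python
import numpy as np


def cdf_match(source, reference, n_quantiles=101):
    """Rescale `source` so its empirical CDF matches that of `reference`.

    Both inputs are 1-D samples (for example, cloud fraction in [0, 1]).
    The two samples need not have the same length or be collocated.
    """
    source = np.asarray(source, dtype=float)
    reference = np.asarray(reference, dtype=float)

    probs = np.linspace(0.0, 1.0, n_quantiles)
    src_q = np.quantile(source, probs)
    ref_q = np.quantile(reference, probs)

    # np.interp expects increasing x-coordinates, so drop repeated
    # source quantiles (these occur when many values are identical).
    src_q, keep = np.unique(src_q, return_index=True)
    ref_q = ref_q[keep]

    # Piecewise-linear transfer function from source to reference values.
    return np.interp(source, src_q, ref_q)


# Synthetic demonstration: a "passive" product with a linear bias
# relative to an "active" reference.
rng = np.random.default_rng(0)
active = rng.uniform(0.0, 1.0, size=5000)
passive = 0.8 * active + 0.1

adjusted = cdf_match(passive, active)
print(np.abs(passive - active).mean(), np.abs(adjusted - active).mean())
```

Because the mapping is built from quantiles rather than paired values, it corrects distribution-level biases only; it does not by itself account for spatiotemporal correlation or product uncertainty, which the abstract assigns to the BME step.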