1. Introduction
Land surface evapotranspiration (ET) is an important hydrological process that controls water and energy transfer between soil, vegetation, and atmosphere. Quantification of ET is widely needed in agricultural, forest, and climatic applications. At local to regional scales, information on the spatial and temporal variations of ET is vital for drought monitoring, ecosystem health assessment, water requirement evaluation, and agricultural water allocation and management [1]. At regional to global scales, quantifying ET is essential for mesoscale and global circulation modeling as it affects water and energy cycles between land, ocean, and atmosphere [2]. Because satellite remote sensing can monitor land surface processes over large areas, various approaches have been developed during the last few decades to estimate ET at moderate to low resolutions from remotely sensed data. These approaches generally include empirical/statistical methods that link land surface temperature (LST) and/or vegetation indices (VI) with in situ ET measurements, and more physically-based models, including surface energy balance (SEB) models such as the Surface Energy Balance Algorithm for Land (SEBAL) [3,4], Mapping EvapoTRanspiration with Internalized Calibration (METRIC) [5], the Atmosphere-Land Exchange Inverse (ALEXI) model [6], and the Surface Energy Balance System (SEBS) [7], as well as Penman–Monteith models [8].
The Moderate Resolution Imaging Spectroradiometer (MODIS) global evapotranspiration products (MOD16) are the first 1 km land surface ET dataset for the 109,030,000 km² global vegetated land surface at 8-day, monthly, and annual intervals [9]. The products were developed using a Penman-Monteith approach driven by MODIS-derived surface albedo, land cover, Leaf Area Index (LAI), the Fraction of absorbed Photosynthetically Active Radiation (FPAR), and daily surface meteorological parameters [9,10]. Various studies have evaluated the products over different regions and shown that they agree well with other satellite or model-based ET datasets, such as the EUropean Organisation for the Exploitation of METeorological SATellites (EUMETSAT) Satellite Application Facility on Land Surface Analysis (LSA-SAF) ET product from Meteosat Second Generation (MSG) [11] and the European Centre for Medium-Range Weather Forecasts (ECMWF) (GLDAS) ET images [12]. Evaluation against ground measurements showed that the errors of the MODIS ET products were around 0.31 mm/day, or 24% of the ET measured at 46 AmeriFlux eddy covariance flux towers [10]. It was also reported that the products had better accuracy in temperate and fully humid climates, but underestimated ET in semiarid climates compared to 15 CarboEurope eddy covariance tower sites [11]. The MODIS ET products have been used in many applications such as terrestrial gross primary productivity (GPP) estimation [13], drought monitoring [14], groundwater recharge quantification [15,16], and watershed runoff simulation [17].
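To make the Penman-Monteith approach mentioned above concrete, the following is a minimal sketch of the generic single-source Penman-Monteith equation for latent heat flux. It is not the MOD16 algorithm, which is described in [9,10]. The function names, argument names, and default constants (psychrometric constant, latent heat of vaporization) are illustrative assumptions.

```python
def penman_monteith_le(delta, available_energy, rho_air, cp_air, vpd, r_a, r_s,
                       gamma=66.0):
    """Latent heat flux (W m-2) from the generic Penman-Monteith equation.

    delta            slope of the saturation vapour pressure curve (Pa K-1)
    available_energy net radiation minus ground heat flux (W m-2)
    rho_air          air density (kg m-3)
    cp_air           specific heat of air (J kg-1 K-1)
    vpd              vapour pressure deficit (Pa)
    r_a, r_s         aerodynamic and surface resistance (s m-1)
    gamma            psychrometric constant (Pa K-1); 66 is an assumed typical value
    """
    numerator = delta * available_energy + rho_air * cp_air * vpd / r_a
    denominator = delta + gamma * (1.0 + r_s / r_a)
    return numerator / denominator


def le_to_mm_per_day(le, latent_heat=2.45e6):
    """Convert latent heat flux (W m-2) to a water depth (mm/day).

    latent_heat is the latent heat of vaporization (J kg-1), assumed constant.
    1 kg of water per m2 corresponds to a 1 mm layer.
    """
    return le / latent_heat * 86400.0


# Illustrative inputs only (not taken from the paper).
le = penman_monteith_le(delta=145.0, available_energy=400.0, rho_air=1.2,
                        cp_air=1013.0, vpd=1000.0, r_a=50.0, r_s=100.0)
print(round(le, 1), "W m-2 =", round(le_to_mm_per_day(le), 2), "mm/day")
```

The surface resistance term is where vegetation information such as LAI and land cover would typically enter; a larger `r_s` lowers the estimated flux.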
Although MODIS ET products have provided a great opportunity for routine monitoring of land surface conditions at mesoscale, the 1 km ET estimates tend to be inaccurate and are not applicable at field, local, or basin scale because of the high spatial heterogeneity of land cover within a pixel. For agricultural applications in particular, the MODIS resolution of 1 km is too coarse and may cause significant ET errors, since a single pixel is generally larger than individual crop fields. During the last five years, a few methodologies have been proposed for downscaling ET from coarser resolution remotely sensed data using information provided by higher resolution imagery. Some studies [18,19,20] introduced disaggregation schemes that have been applied in thermal data fusion by examining the linear relationship between lower and higher resolution ET from multisensor datasets; other research [21,22] made use of image fusion methods such as the spatial and temporal adaptive reflectance fusion model (STARFM), which was designed for surface reflectance blending. Some studies aimed at improving the spatial detail of coarse resolution ET based on fine resolution remotely sensed data [19,21], while others aimed at generating ET with both high spatial and high temporal resolution [21,22]. A comprehensive review by Ha et al. [23,24] stated that although downscaling methods have been widely developed to retrieve high resolution air temperature, precipitation and soil moisture, approaches specifically for ET downscaling have not been fully explored. The review covered 28 downscaling methods used in various disciplines including hydrology, meteorology, image processing, and geostatistics, and addressed the potential of introducing these methods for ET downscaling.
The existing approaches developed specifically for ET downscaling mainly involved generating ET from both MODIS and Landsat images and then evaluating the relationship between the two ET products to derive daily [18,22], monthly [19] or seasonal [20] ET at Landsat resolution. The accuracy of the downscaled ET therefore depends on both the MODIS and the Landsat ET calculations, which are mostly based on SEB models such as METRIC and SEBAL. Although SEB models have been widely used for Landsat-series data including Landsat 5 Thematic Mapper (TM) and Landsat 7 Enhanced Thematic Mapper Plus (ETM+) [3,4,5,6,7,8], they have not been used for recently launched Landsat 8 data. Research has revealed that estimation of daily ET at pixel level could be very sensitive to LST, to meteorological input parameters such as net radiation and air temperature, and to reference ET calculated with these parameters [25,26]; some parameters are satellite sensor specific and have been well established for the MODIS and Landsat TM/ETM+ sensors, but not yet for the Landsat 8 Operational Land Imager (OLI) or Thermal Infrared Sensor (TIRS). Conventional SEB models that were tuned for Landsat TM/ETM+ sensors must be used with caution because the new sensors differ from their predecessors in spectral bands and instrument performance [27].
In this paper, we introduce machine learning approaches for MODIS ET downscaling. Although machine learning algorithms have been successfully used in downscaling land parameters in other contexts, such as downscaling of General Circulation Model (GCM) outputs [28] or latent heat flux estimates from the Noah land surface model [29], and have also been used in downscaling satellite-derived soil moisture [30], they have not been applied in remotely sensed ET downscaling [23,24]. The objective of this study is to develop a machine learning-based method to generate 8-day ET maps at 30 m resolution, based on the MODIS 8-day ET product (MOD16A2) and Landsat 8 data, aiming to produce higher resolution ET that is consistent with the MODIS ET product. The method assumes that vegetation and surface wetness conditions do not change considerably within an 8-day interval. We examined three machine learning approaches, namely Cubist, Support Vector Regression (SVR), and Random Forest (RF), for downscaling the MODIS 1 km 8-day ET product (MOD16A2) using eleven 30 m Landsat 8-derived variables (indices): surface albedo, LST, Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), Soil Adjusted Vegetation Index (SAVI), Modified Soil Adjusted Vegetation Index (MSAVI), Normalized Difference Moisture Index (NDMI), Normalized Difference Water Index (NDWI), Depth of Landsat 8 OLI band 6 (D1609), Normalized Difference Infrared Index for Landsat 8 OLI band 7 (NDIIb7), and Temperature Vegetation Dryness Index (TVDI). Among the eleven indices, LST has been successfully used for ET estimation due to its sensitivity to local moisture status; surface albedo has been used as an input parameter for various SEB models; vegetation greenness indices including NDVI, EVI, SAVI and MSAVI are associated with the growing status of vegetation; and wetness indices including NDWI, NDMI, D1609, and NDIIb7 are highly correlated with the water consumption reflected in ET processes. The method also assumes that the growing status of vegetation is highly correlated with the water consumption reflected in ET processes. Unlike previous ET downscaling methods, the machine learning models in our study do not need to calculate Landsat-based ET and thus do not rely on the accuracy of meteorological parameters. The models were tested at four study sites within different climate zones in the United States and South Korea, and evaluated using in situ flux tower measurements.
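The general idea described above can be sketched as follows: aggregate a fine-resolution predictor to the coarse grid, fit a model against the coarse ET there, and apply the fitted model to the fine grid. This is a minimal illustration under stated assumptions, not the authors' implementation. The aggregation step is assumed, a one-variable ordinary least squares fit stands in for Cubist, SVR, or RF, NDVI alone stands in for the eleven predictors, and an integer scale factor replaces the real resampling between 30 m and 1 km grids. All function and variable names are hypothetical.

```python
from statistics import mean


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def block_mean(grid, factor):
    """Average a fine grid (list of rows) over non-overlapping factor x factor blocks.

    Assumes the grid dimensions are exact multiples of factor.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [
            mean(grid[r + i][c + j] for i in range(factor) for j in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (stand-in for the ML regressors)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def downscale(fine_predictor, coarse_et, factor):
    """Train at coarse resolution, then predict ET on the fine grid."""
    coarse_predictor = block_mean(fine_predictor, factor)
    x = [v for row in coarse_predictor for v in row]
    y = [v for row in coarse_et for v in row]
    intercept, slope = fit_line(x, y)
    return [[intercept + slope * v for v in row] for row in fine_predictor]


# Synthetic 4 x 4 fine grid of NDVI and a 2 x 2 coarse ET grid (values made up).
fine_ndvi = [
    [0.2, 0.4, 0.6, 0.8],
    [0.2, 0.4, 0.6, 0.8],
    [0.1, 0.3, 0.5, 0.7],
    [0.1, 0.3, 0.5, 0.7],
]
coarse_et = [[22.0, 38.0], [18.0, 34.0]]  # mm per 8 days
fine_et = downscale(fine_ndvi, coarse_et, factor=2)
for row in fine_et:
    print([round(v, 1) for v in row])
```

With a linear stand-in model, averaging the downscaled values back to the coarse grid reproduces the fitted coarse values, which is one simple way to check consistency with the coarse product. A nonlinear learner such as RF does not guarantee this property.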
6. Conclusions
In this study, we proposed a machine learning-based method for MODIS ET downscaling with Landsat 8 data. Eleven indicators derived from Landsat 8 data, including albedo, LST, and VIs such as NDVI, EVI, SAVI, MSAVI, NDMI, NDWI, D1609, NDIIb7 and TVDI, were used as predictor variables, and the MODIS 8-day 1 km ET product (MOD16A2) was used as the response variable, to build machine learning models. Among the three machine learning algorithms examined (SVR, Cubist, and RF), RF and Cubist resulted in higher model accuracies. The rRMSEs of the RF models were less than 20% for the four study sites in the US and South Korea, which is within the error range of the MODIS ET product (rRMSE around 25%). The variable importance analysis showed that vegetation growth-related indicators such as NDVI, EVI, and SAVI were most important in 8-day ET modeling by RF, while vegetation moisture-based indicators such as NDIIb7 and NDWI had slightly lower importance. LST-related indicators representing temperature at the time of image acquisition were less important than either vegetation greenness or wetness indicators. The predicted Landsat downscaled ET had an overall good agreement with MODIS ET, with an average rRMSE of around 22%. The downscaled ET product showed a temporal trend similar to that of MODIS ET. In addition, the product showed greater spatial detail; especially in crop areas, the downscaled product could represent ET variation at patch scale. Neither the MODIS ET nor the Landsat downscaled ET product agreed very well with in situ ET, probably because of land cover heterogeneity around flux tower sites; however, the downscaled ET estimates were closer to in situ observations (R² = 0.76 and 0.77, RMSE = 11.8 and 12.5 mm/8 days for Site 1 and Site 2, respectively). The results showed the potential of using machine learning approaches for ET downscaling considering their effectiveness and ease of implementation.
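The accuracy measures quoted above (RMSE, rRMSE, and R²) can be computed as in the following minimal sketch. The exact definitions used in the study are not given in this section. Here rRMSE is assumed to be RMSE divided by the mean of the reference values, and R² is assumed to be the squared Pearson correlation. The function names and sample values are illustrative.

```python
from math import sqrt
from statistics import mean


def rmse(pred, obs):
    """Root mean square error between predictions and reference values."""
    return sqrt(mean((p - o) ** 2 for p, o in zip(pred, obs)))


def rrmse(pred, obs):
    """Relative RMSE in percent, normalized by the mean reference value (assumed)."""
    return 100.0 * rmse(pred, obs) / mean(obs)


def r_squared(pred, obs):
    """Squared Pearson correlation between predictions and reference values."""
    mp, mo = mean(pred), mean(obs)
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov * cov / (var_p * var_o)


# Made-up 8-day ET values (mm/8 days), for illustration only.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(rmse(predicted, observed), rrmse(predicted, observed),
      r_squared(predicted, observed))
```

Note that a squared correlation can be high even when predictions are biased, which is why it is reported alongside RMSE.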
Our next study will explore spatial-temporal fusion methods that combine the advantages of high spatial resolution of Landsat data and high temporal resolution of MODIS ET in order to generate 8-day ET at 30 m resolution.