IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021 5529

Improving Chlorophyll-A Estimation From Sentinel-2 (MSI) in the Barents Sea Using Machine Learning

Muhammad Asim, Camilla Brekke, Member, IEEE, Arif Mahmood, Torbjørn Eltoft, Member, IEEE, and Marit Reigstad

Abstract: This article addresses methodologies for remote sensing of ocean Chlorophyll-a (Chl-a), with emphasis on the Barents Sea. We aim to improve the monitoring capacity by integrating in situ Chl-a observations and optical remote sensing to locally train machine learning (ML) models. For this purpose, in situ measurements of Chl-a ranging from 0.014 to 10.81 mg/m3, collected during 2016–2018, were used to train and validate models. To accurately estimate Chl-a, we propose to use additional information on pigment content within the productive column by matching the depth-integrated Chl-a concentrations with the satellite data. Using the optical images captured by the multispectral instrument (MSI) on Sentinel-2 and the in situ measurements, a new spatial window-based match-up dataset creation method is proposed to increase the number of match-ups and hence improve the training of the ML models. The match-ups are then filtered to eliminate erroneous samples based on the spectral distribution of the remotely sensed reflectance. In addition, we design and implement a neural network model, dubbed the ocean color net (OCN), that performs better than existing ML-based techniques, including Gaussian process regression (GPR); regionally tuned empirical techniques, including the ocean color (OC3) algorithm and the spectral band ratios; and the globally trained Case-2 regional/coast colour (C2RCC) processing chain model C2RCC-networks. The proposed OCN model reduced the mean absolute error relative to the GPR by 5.2%, C2RCC by 51.7%, OC3 by 22.6%, and spectral band ratios by 29%.
Moreover, the proposed spatial window and depth-integrated match-up creation techniques improved the performance of the proposed OCN by 57%, GPR by 41.9%, OC3 by 5.3%, and the spectral band ratio method by 24% in terms of RMSE compared to the conventional match-up selection approach.

Index Terms: Barents Sea, Chlorophyll-a (Chl-a) monitoring, ocean color (OC).

Manuscript received October 8, 2020; revised January 16, 2021 and March 9, 2021; accepted April 7, 2021. Date of publication April 22, 2021; date of current version June 8, 2021. The work was supported in part by the Nansen Legacy Project, RCN under Project 276730 and in part by "Centre for Integrated Remote Sensing and Forecasting for Arctic Operations" (CIRFA), RCN under Project 237906. (Corresponding author: Muhammad Asim.)
Muhammad Asim, Camilla Brekke, and Torbjørn Eltoft are with the Department of Physics and Technology, UiT, Arctic University of Norway (UiT), 9019 Tromsø, Norway (e-mail: muhammad.asim@uit.no; camilla.brekke@uit.no; torbjorn.eltoft@uit.no).
Arif Mahmood is with the Department of Computer Science, Information Technology University of the Punjab, Lahore 5400, Pakistan (e-mail: arif.mahmood@itu.edu.pk).
Marit Reigstad is with the Department of Arctic and Marine Biology, UiT, Arctic University of Norway, 9019 Tromsø, Norway (e-mail: marit.reigstad@uit.no).
Digital Object Identifier 10.1109/JSTARS.2021.3074975

I. INTRODUCTION

THE Barents Sea is a large Arctic shelf that covers about 10% of the Arctic Ocean [1]. The northern part of the Barents Sea is seasonally ice-covered, while the southern part is sea-ice-free due to the inflow of salty, warm, and nutrient-rich waters from the Atlantic Ocean through the Nordic Seas [2]. Almost 40% of the total Arctic primary production occurs in the Barents Sea, which also hosts Norway's richest commercial fisheries [3]. However, the Barents Sea is experiencing significant changes as a result of global warming.
The increased inflow of Atlantic water has caused up to a 50% reduction in the sea-ice-covered region in the last decade [4]. Due to sea-ice loss and weaker stratification of the water column, the sea under the melting ice in the Barents Sea receives prolonged sunlight during summer and fall, which has increased the production and seasonal growth of phytoplankton [5], [6]. Investigating how altered physical conditions in different seasons affect primary productivity is therefore crucial to understanding the ecosystem of the changing Barents Sea. It is within this context that the current study aims to develop new methods that can more accurately track phytoplankton biomass variability in the Barents Sea. Phytoplankton are recognized as valuable indicators of marine ecosystem health and water quality, and are sensitive to climate change [7]. As a light-harvesting pigment in phytoplankton, Chlorophyll-a (Chl-a) is regarded as a proxy for biomass in the water column [8]. Phytoplankton form the base of aquatic food webs and can grow rapidly in a short period depending on the availability of nutrients, sunlight, nitrogen, or phosphorus concentration [6], [9]. An excessive concentration of phytoplankton harms the fishery, local economy, marine animals, and public health [10], which makes it critical to carefully evaluate the concentration of Chl-a. Several studies have been conducted on modeling the net primary production and Chl-a content in the Barents Sea, though many are based solely on in situ measurements [6], [11]–[15]. Several methods integrating in situ with satellite-based observations have also been proposed [1], [16]–[23]. These studies on Chl-a retrieval are based on either empirical or semianalytical approaches and are confined to relatively small spatial and temporal scales.
Some of the existing methods are applied to in situ remote sensing reflectance (Rrs) data and validated either on low spatial resolution satellite sensors or on only a few images [23]. For example, Le et al. [1] used a 3-D sea-ice plankton ecosystem model to study primary production in the northern Barents Sea for the summer months only. Engelsen et al. proposed an empirical method to estimate Chl-a content across the water column using sea-viewing wide field-of-view sensor (SeaWiFS) data confined to the early bloom season [17]. Kogeler et al. used an empirical model based on the blue/green ratio to estimate Chl-a using only 35 images acquired from the CZCS sensor [18]. Dalpadado et al. divided the Barents Sea region into 15 polygons and computed the correlation between the mean of in situ Chl-a samples and all valid Chl-a pixels from SeaWiFS and MODIS Aqua within a polygon [6]. More recently, a bio-optical model was developed from a set of in situ observations of Chl-a and inherent optical properties (IOPs) collected only in the bloom season. Due to cloud cover and longer time gaps, the estimated Rrs spectra derived from IOPs were validated with an eight-day average MODIS-A observation [23]. Thus, most of the existing methods are not validated independently on high-resolution satellite data such as the Sentinel-2 multispectral instrument (MSI) covering a wide area of the Barents Sea. Considering the importance of long-term monitoring of water quality, a reliable algorithm is needed to accurately estimate Chl-a in the transitional Barents Sea.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Recently, with the increasing popularity of ML in the field of remote sensing, several ML-based methods have proven effective in retrieving Chl-a from water bodies. However, for the Barents Sea, to the best of our knowledge, no thorough study has been reported on Chl-a estimation using ML techniques integrated with remotely sensed data. The most widely explored ML methods include artificial neural networks (ANNs) [24], support vector regression (SVR) [25], relevance vector regression (RVR) [26], random forests (RF) [27], Gaussian process regression (GPR) [28], [29], and mixture density networks (MDN) [8]. ANNs have attracted many researchers due to their ability to learn highly nonlinear relationships [24], [30]–[32]. However, most of these existing studies have used built-in software ANN modules; therefore, the architecture of ANNs has not been well explored despite their potential effectiveness in estimating nonlinear functions. The current study explores in detail the architecture and ability of deep ANNs based on the multilayer perceptron (MLP) to accurately map water-leaving Rrs to Chl-a concentrations for the Barents Sea, which is a novel application area. In most of the existing studies [8], [24], [30], researchers have associated surface or near-surface Chl-a concentration ([Chl-a]surf) at some discrete depths with the water-leaving Rrs. This approach restricts Chl-a estimation to the upper layer of the water column, although solar radiation is not restricted to the near-surface. Depending on the IOPs of the water body (scattering and absorption), radiation can penetrate deeper, and a satellite will capture the integrated effect across the water column.
Moreover, in biogeochemical applications such as primary production estimation or investigating the vertical distribution of algal species, the near-surface Chl-a content estimated by ocean color (OC) sensors is insufficient to track the algal biomass over the entire depth range where algae can live and grow [33]. Therefore, in the current work, we propose to integrate Chl-a across the water column depending on the light penetration depth (Zpd) in order to accurately estimate the primary production. In some previous studies, a median or mean Rrs value over a spatial window has been associated with the in situ Chl-a samples [34]–[36]. Warren et al. resampled all the spectral bands to a common spatial resolution and used the central pixel in the window [37]. In contrast to the existing approaches, we propose to use all valid pixels in a spatial window without taking the mean or median of the Rrs values. Our approach increases the match-up dataset size and improves the overall performance of the proposed model. It also improves the performance of existing empirical and ML methods in estimating Chl-a in open ocean waters such as the Barents Sea. Matching each in situ measurement of Chl-a to all valid pixels in a window yields multiple estimated values of Chl-a. The median over these estimated values is then computed, which is a more robust estimate of Chl-a. In addition, we propose a filtering criterion based on the spectral distribution of Rrs. After applying the recommended atmospheric correction (AC) quality flags [37], [38], the match-ups are further processed to remove the nonphysical and unrealistic measurements, in terms of spectral distribution and amplitude, that arise due to the time gap or uncertainty in the AC algorithm. The system diagram illustrating the main components of the proposed methodology is given in Fig. 1. The major contributions of the present study are as follows.
1) In the current work, we analyze various techniques for match-up selection and Chl-a retrieval from the Barents Sea.
2) To account for the uncertainty in the remotely sensed data, we also propose a match-up dataset filtering method based on the concentration of Chl-a and the spectral distribution of Rrs.
3) We propose to retrieve depth-integrated Chl-a to track the phytoplankton bloom appearing down the water column for a more accurate estimation of the biomass.
4) By combining the proposed data augmentation technique with the depth-integrated-average Chl-a, we formulate a novel Chl-a estimation framework that enhances the performance of the proposed as well as the compared methods.
5) To improve the Chl-a estimation accuracy in the sub-Arctic waters, we propose a neural network-based algorithm dubbed OCN.
6) The proposed match-up dataset creation, data augmentation, and depth integration techniques have improved the Chl-a retrieval performance of all the methods considered in this study. The proposed OCN model has outperformed all the compared methods.

The remainder of the article is organized as follows. Section II presents related work, whereas Section III is devoted to material and satellite data acquisition. The match-up selection and ML methodologies are presented in Sections IV and V, and the experimental results are discussed in Section VI. Finally, Section VII concludes the article.

Fig. 1. Proposed OCN framework for estimating Chl-a. (a) Input top-of-atmospheric reflectance (ρrs) (Section III-B). (b) ρrs is corrected for atmospheric effects to extract surface Rrs (Section IV-A). (c) Window over Rrs pixels centered around the in situ location (Section IV-C). (d) Filtering block to get valid Rrs pixels (Section IV-B). (e) Feature extraction block over the valid Rrs pixels (Section V-A). (f) Input layer of the FC neural network.
(g) First fully connected block consisting of an FC layer and a batch normalization (BN) layer (Section V-A2). (h) Second FC block with FC and BN layers. (i) In situ Chl-a sampling (Section III-A). (j) In situ depth integration block (Section IV-D). (k) Network loss computation (3). (l) Output of the network over the window on the test dataset. (m) Information fusion block (Section IV-C). (n) Output of the proposed framework, Chl-a. (o) Chl-a maps.

II. RELATED WORK

OC remote sensing is a practical and powerful tool for monitoring aquatic environments and providing estimates of the near-surface concentration of water quality parameters such as Chl-a in the open ocean [39], [40], coastal waters [41], and inland waters [42]. Existing Chl-a retrieval algorithms may be divided into two categories: analytical approaches and empirical methods [7]. Most analytical approaches consist of two steps: derivation of the IOPs that determine the color of water, followed by estimation of Chl-a content. In the empirical approaches, Chl-a concentration is estimated directly from Rrs, also known as the inversion approach. The empirical methods rely on the estimates of phytoplankton absorption peaks within the blue and red portions of the spectrum [43], [44]. Chl-a in open ocean waters has been estimated using the ratio of blue to green bands, which assumes that the shape and magnitude of the Rrs spectrum between the blue and green bands is primarily driven by the concentration of Chl-a, with minimal effect from other organic and inorganic substances [7]. Previous studies have shown that the blue-to-green ratio has a strong correlation with Chl-a in clear waters. The polynomial coefficients in the ocean color (OC) algorithm [45], where the blue-to-green ratio of Rrs(λ) statistically relates to Chl-a through a polynomial expression, have been tuned according to the spectral configuration of various satellite sensors.
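As a rough illustration of the band-ratio idea described above, the following sketch evaluates a polynomial in the log-transformed blue-to-green Rrs ratio, which is the form commonly used by OC-type algorithms. The function name `band_ratio_chla` and the coefficient values in the usage note are hypothetical placeholders chosen for illustration; they are not the tuned coefficients of [45] or of any sensor.

```python
import numpy as np

def band_ratio_chla(rrs_blue, rrs_green, coeffs):
    """Sketch of an OC-style band-ratio estimate (assumed form).

    log10(Chl-a) = a0 + a1*X + a2*X**2 + ..., with X = log10(Rrs_blue / Rrs_green).
    `coeffs` = [a0, a1, ...] must come from a regional or sensor-specific fit;
    none are supplied here.
    """
    x = np.log10(np.asarray(rrs_blue, dtype=float) / np.asarray(rrs_green, dtype=float))
    # polyval evaluates coeffs[0] + coeffs[1]*x + coeffs[2]*x**2 + ...
    log_chla = np.polynomial.polynomial.polyval(x, coeffs)
    return 10.0 ** log_chla
```

For example, with the placeholder coefficients `[0.0, -1.0]`, a blue-to-green ratio of 10 maps to 0.1 and a ratio of 1 maps to 1.0, reflecting the general tendency of the ratio to fall as Chl-a rises.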
More recently, 65 polynomial expressions were developed for 25 satellites utilizing 2720 pairs of coincident Chl-a and corresponding Rrs [45]. The Rrs spectrum in coastal and inland waters is affected by the presence of other constituents, which often leads to an overestimation of Chl-a [8], [46]. Therefore, several other empirical formulations have also been proposed, including the red-edge ratio methods [47]–[49], the line height (LH) method [50], hybrid methods [51], and ML-based methods [24], [30]–[32]. Level-2 products from Sentinel-2 MSI and from the ocean and land colour instrument (OLCI) onboard Sentinel-3, as well as AC processors such as Acolite, C2RCC, and SeaDAS, estimate Chl-a using band ratios, semianalytical methods, or ML methods such as NNs, which are trained globally on a large amount of simulated data. Efficient retrieval of Chl-a across all water types using a single method is quite challenging. Smith et al. suggested that an algorithm should be locally trained to learn the nonlinearity of the functional dependence between the reflected water-leaving radiance and Chl-a concentrations [52]. More recently, ML-based methods trained locally on the area under observation have attracted researchers due to their improved performance [27], [29], [32], [53].

TABLE I. DESCRIPTIVE STATISTICS OF IN SITU CHL-A CONCENTRATIONS AT VARYING DEPTHS DURING YEARS 2016–2018 IN THE BARENTS SEA. The in situ data are collected as part of the ecosystem monitoring program, IMR.

Most of the abovementioned methods associate in situ measurements with satellite observations in a window of 3×3 [34], [35], [54] or 5×5 pixels [36], centered around the in situ location. A mean or median of cloud-free and valid pixels is computed to extract a single value of Rrs for each in situ sample. Pu et al.
utilized convolutional neural networks (CNNs) to model the relationship between Landsat-8 images and in situ water-quality levels by considering a spatial window of 1 km2 (7×7 pixels) at each monitoring station [55]. Pyo et al. [56] also developed a CNN-based regression model to estimate Chl-a concentrations using hyperspectral images acquired from an airborne sensor. They used a window of 8×8 pixels for extracting the nonlinear spatial features of the algal pigment. These CNN-based regression models require a large cloud-free dataset with minimal time difference between the in situ and remote sensing data. To address this issue, Pyo et al. [56] used airborne hyperspectral imagery to train the CNN, which is much more costly than using freely available satellite image data. Moreover, these approaches are based on a fixed window size, which may include invalid pixels depending on the observation conditions. In contrast, in the current work, we propose an NN based on the multilayer perceptron (MLP), with the flexibility to remove invalid pixels from each window, that can work efficiently for smaller datasets.

III. DATASET ACQUISITION

In this section, we discuss the collection of in situ Chl-a data and the overlapping satellite observations.

A. In Situ Observations

The Barents Sea is one of the most productive oceanic areas in the world, and it has an average depth of 230 m with a total area of 1.5 million km2 [6]. Sampling of conductivity-temperature-depth (CTD) fluorescence of Chl-a was carried out in the years 2016–2018, as part of the Ecosystem Program of the Institute of Marine Research (IMR), Norway. The Chl-a CTD data were collected from a vast region in the Barents Sea, covering various oceanographic conditions. In addition to the samples from the surface, Chl-a measurements were also collected at discrete depth intervals down to 100 m.
Data were collected from various CTD stations: 232 in 2016, 405 in 2017, and 424 in 2018. The Chl-a concentration varies from 0.014 to 10.81 mg/m3. The in situ measurements were collected throughout the year; however, only measurements from April to October are used in this study. The remaining months are dark, with extremely low solar elevations that make remote sensing unsuitable for OC monitoring. The monthly and yearly variation in Chl-a content across the water column is shown in Table I. The spatial locations of in situ data are shown in Fig. 2(a)–(c).

Fig. 2. Study region and locations of in situ observations of Chl-a (black dots) and match-ups (red color) from year (a) 2016, (b) 2017, and (c) 2018.

TABLE II. SENTINEL-2 SPECTRAL BANDS WITH SNR AT THE REFERENCE RADIANCE L_REF

B. Satellite Image Data Acquisition

The Sentinel-2A/2B satellites from the European Space Agency (ESA), each carrying an MSI with a swath of 290 km, are in the same orbit, 180° apart from each other. The revisit time of each satellite is 10 days at the equator, meaning that the twin satellites together revisit the same area every five days, with a wide field of view, covering land and coastal areas [57], [58]. Reacquiring a cloud-free image of a specific area may take significantly more time, depending on the weather conditions. Note that cloud cover is much more persistent in high-latitude areas such as the Barents Sea. The Sentinel-2A/2B mission provides the reflected solar spectral radiances in 13 spectral bands in the visible, near-infrared, and short-wave infrared parts of the electromagnetic spectrum. As shown in Table II, among the 13 spectral bands, the four bands centered at 493 nm (blue), 560 nm (green), 665 nm (red), and 842 nm (NIR) have a spatial resolution of 10 m.
These four bands are suitable for the retrieval of biogeochemical products and IOPs of the water column. The six bands centered at 705, 740, 783, and 864 nm (covering the visible and near-infrared (VNIR) region) and at 1610 and 2190 nm (covering the short-wave infrared (SWIR) region) have a spatial resolution of 20 m. These six bands are suitable for applications such as snow, ice, and cloud masking. The remaining three bands, centered at 443, 945, and 1375 nm, have a spatial resolution of 60 m and are suitable for AC and cloud screening. These bands are also used for aerosol retrieval, water vapor correction, and cirrus detection [59]. Sentinel-2 acquires spectral observations from −56° to 84° latitude [59] and is therefore suitable for OC monitoring in the Barents Sea. Sentinel-2A/2B Level-1C (L1C) data, colocated in space and within ±1 day of the in situ observations for the period 2016–2018 (April–October), with a cloud coverage of ≤30%, were acquired online (available: https://scihub.copernicus.eu/dhus). The L1C product provides geocoded top-of-atmospheric (TOA) reflectance, with associated cloud mask, land/water mask, and quality flags. To ensure cloud-free pixels in a window of 3×3 pixels centered at the in situ observation location, the Sentinel-2 L1C built-in cloud mask was applied in the sentinel application platform (SNAP) v6.0 processing toolbox prior to applying AC. The cloud mask makes it possible to identify both cirrus and dense clouds. Dense clouds have a high reflectance in the blue wavelength (493 nm). If the reflectance in the blue band is greater than a threshold, the pixel is identified as covered by dense clouds, also known as opaque clouds [60]. Cirrus clouds are thin and semitransparent and usually form at approximately 6–7 km above the Earth's surface. The high atmospheric absorption in band 10 (1375 nm) makes the detection of cirrus clouds possible.
A time window of ±1 day between in situ and satellite measurements was used to find match-ups. For comparison, Warren et al. [37] allowed a window of ±1 day for inland waters, Kuhn et al. [61] allowed a time window of ≤ ±1 day for three different rivers, while Le et al. [49] and Pan et al. [62] allowed a window of ±24 h and ±8–32 h, respectively, for coastal waters to obtain a sufficient number of valid match-up pairs for algorithmic validation. More recently, a larger time window of ±2 days was used by Liu et al. [46] for 36 different water bodies, including coastal waters, inland lakes, reservoirs, and rivers in the United States and China. If the pixels of interest in the acquired scene corresponding to the in situ location are identified as invalid or defective, then the next scene within the specified time window is analyzed. If none of these masks or quality flags are true, the pixel is considered water and processed through the AC algorithm. If the pixels of interest are found cloudy or defective in all available scenes, then that in situ observation is discarded.

IV. MATCHUP SELECTION METHODOLOGIES

In this section, we present the proposed match-up selection methodologies. Before the proposed match-up selection is used, AC is applied to convert the TOA signal to above-water Rrs.

A. Atmospheric Correction (AC) Algorithm

AC is a crucial step in OC monitoring algorithms. The signal retrieved by satellite sensors contains <10% water-leaving radiance; the remainder is the contribution from the atmosphere [63]. The water-leaving radiance is then converted to Rrs, the ratio of water-leaving radiance to the total downwelling irradiance measured just above the water surface, which carries information about the water column and can be used to derive OC products such as Chl-a concentration [54].
Prior to applying AC, Sentinel-2 L1C data were resampled to a spatial resolution of 60 m [37]. This spatial resolution is selected to improve the signal-to-noise ratio and enable the application of AC. The resampled TOA reflectances were then atmospherically corrected into the water-leaving Rrs (sr−1) using the C2RCC AC processor. The choice of C2RCC is motivated by its good performance in [37] and [38]. The C2RCC processor is based on the ANN method, where the ANNs are trained on a large database of simulated water-leaving reflectances and related TOA radiances. The trained ANN is then used to perform the inversion of TOA radiances to water-leaving radiance. Moreover, it also generates Chl-a maps and estimates the IOPs of the water body. C2RCC is a modified form of the previous Case2Regional and CoastColour AC algorithms. In this study, compared to other AC processors such as Acolite and Sen2Cor, it better preserved the spectral shape in both bloom and nonbloom Barents Sea waters in the blue, green, and NIR bands. The C2RCC processor is provided in the SNAP processing toolbox from ESA. In addition to calibrated TOA reflectances, C2RCC requires salinity, ozone, air pressure, and temperature as input parameters. The average temperature and salinity were set to 8 °C and 34.5 PSU following Climate Explorer (see footnote 2). The remaining parameters were set to default values [37]. Any pixel corresponding to the in situ measurements that passed the recommended quality flags [37] is considered a potentially valid pixel and selected for further processing. The quality flags used in the current study include Cloud_risk, Rtosa_OOS, Rhow_OOS, and VALID_PE. The Cloud_risk flag indicates cloudy conditions, and any pixel affected by clouds was excluded. The Rtosa_OOS flag is true when the input spectrum to the C2RCC-net algorithm is out of the training range; in that case, the inversion of the TOA reflectance to surface Rrs is most likely to be incorrect.
The Rhow_OOS flag is true when the input spectrum to the IOP neural net is not within the training range of the neural net. The inversion is likely to be wrong in this case as well. VALID_PE is the operator's valid pixel expression, which is true for valid pixels and false otherwise [38].

2 [Online]. Available: https://climexp.knmi.nl/

B. Proposed One-to-One Match-Up Selection

The in situ measurements of Chl-a are matched with the corresponding Rrs pixels using a baseline setting of one-to-one matching. In this matching scheme, each in situ measurement is matched to the nearest pixel in the satellite image [37]. The baseline scheme is then extended to one-to-window matching, where each in situ measurement is matched to all the valid pixels in a window of size 3×3, centered at the in situ location. The valid pixels correspond to the water-leaving Rrs that pass the quality flags as well as the filtering criterion defined below. The one-to-window matching can also be considered a data augmentation technique, and it has improved the performance of the proposed as well as the compared algorithms. In the one-to-one scheme, since the satellite data have already been resampled from 10 and 20 m to 60-m resolution, each Chl-a measurement was matched to the spatially closest pixel [37] instead of to a mean or median of a window of 3×3 pixels [34], [38]. Only water pixels that passed the aforementioned quality flags were included in the match-up dataset. The time window between the in situ and satellite data significantly affects the size and quality of the match-up dataset. Allowing a longer time gap produces more match-ups but risks the reliability of the system due to the dynamic nature of the water body, especially in coastal waters [37].
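The one-to-one matching described above amounts to a nearest-pixel lookup. A minimal sketch follows, assuming the resampled scene lies on a regular projected grid described by 1-D arrays of pixel-center coordinates; the names `nearest_pixel`, `x_coords`, and `y_coords` are illustrative assumptions, not part of the original processing chain.

```python
import numpy as np

def nearest_pixel(x_coords, y_coords, x0, y0):
    """Return (row, col) of the pixel center closest to the station at (x0, y0).

    Assumes x_coords (columns) and y_coords (rows) are 1-D arrays of pixel-center
    coordinates in the same projected system (e.g., metres) as the station position.
    """
    col = int(np.argmin(np.abs(np.asarray(x_coords, dtype=float) - x0)))
    row = int(np.argmin(np.abs(np.asarray(y_coords, dtype=float) - y0)))
    return row, col
```

The returned indices can then be used to read the Rrs spectrum and the quality flags of that pixel before deciding whether the match-up is kept.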
Considering the ocean dynamics and the larger training data requirement of ML algorithms to learn the mapping between Rrs and Chl-a concentrations, we have proposed a new match-up selection criterion based on the spectral distribution of Rrs. After applying the quality flags, potentially valid pixels are processed to remove the nonphysical and unrealistic measurements, in terms of spectral distribution and amplitude, that arise due to the time gap between the in situ and satellite data or errors in the AC algorithm. The filtering operation is performed using the shape characteristics of the spectral distribution. By carefully analyzing the samples, i.e., the in situ Chl-a and the corresponding Rrs spectra, we observe that when the time gap between the in situ observations and satellite images is small, the Rrs spectra correspond to the same spectral distribution as reported in previous studies [34], [64]. The data samples not following the spectral ratio criterion are outliers and are therefore removed from the match-up dataset:

\[
\begin{cases}
\dfrac{R_{rs}(\lambda_{560\,\mathrm{nm}})}{R_{rs}(\lambda_{492\,\mathrm{nm}})} < 1, & \text{if Chl-a} < 1~\mathrm{mg/m^3} \\[2ex]
\dfrac{R_{rs}(\lambda_{560\,\mathrm{nm}})}{R_{rs}(\lambda_{492\,\mathrm{nm}})} \geq 1, & \text{if Chl-a} \geq 1~\mathrm{mg/m^3}.
\end{cases}
\]

Increasing Chl-a generally results in higher reflectance across the green and NIR regions of the spectrum [7], [44], [45]. CDOM, on the other hand, tends to reduce the reflectance, especially below 500 nm [37]. By carefully observing the match-ups, we find that as the time gap increases (within ±1 day), some cases show high reflectance at 492 nm instead of at 560 nm despite high Chl-a concentration; we consider these cases outliers.

Fig. 3. Match-up Rrs spectra of Chl-a concentrations after filtering when (a) Chl-a < 1 and (b) Chl-a ≥ 1. Erroneous Rrs spectra when (c) Chl-a < 1 and (d) Chl-a ≥ 1. The black curves represent mean concentration of Chl-a. The time gap between in situ and satellite data is ≤ ±1 day.

It should be
noted that these abnormal Rrs spectra are not due to CDOM; otherwise, the Rrs spectra, irrespective of Chl-a concentration, would have shown low reflectance in the blue wavelengths, mainly below 500 nm. The observed spectral behaviors for Chl-a ≤ 1.0 and Chl-a > 1.0 are quite different, as shown in Fig. 3(a) and (b). In Fig. 3(c), it can be seen that the erroneous Rrs spectra (peaks in the green wavelength) for low concentrations of Chl-a have almost the same order of magnitude as the Rrs spectra that are physically correct and included in the match-ups [Fig. 3(b)]. We also observe that the green or NIR to red band ratios showed no significant relationship with Chl-a concentrations in match-ups or outliers. Moreover, Rrs in the NIR band is low compared to the green band and does not show significant variations. This means that the Rrs spectra are not affected by suspended solid matter. These erroneous Rrs spectra may have arisen due to the time difference between the in situ and satellite data or uncertainties in the AC algorithm. We experimentally observe that if these abnormal measurements are not removed from the training data, all the methods show degraded performance, as shown in Fig. 4 and Table V (Case iv). The proposed match-up selection technique makes the remaining set of observations consistent with the spectral behavior of Chl-a, as reported in previous studies [7], [34]. It allows a larger time window to be used to increase the match-up dataset while reducing the adverse effect caused by the temporal mismatch between the in situ and the satellite data and by errors in the AC algorithm [37].

TABLE III. OCN MODEL PARAMETERS FOR CHL-A RETRIEVAL

C. Proposed One-to-Window Match-Up Selection

Instead of associating the in situ samples with a single nearest pixel in the satellite image, we consider associating them with all potentially valid Rrs that pass the quality flags in a window of 3×3 pixels, centered at the in situ location.
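The spectral-ratio filter of Section IV-B, which is applied to the pixels selected in either matching scheme, can be sketched as a boolean mask over the match-ups. This is a minimal sketch under the stated criterion; the function name `spectral_ratio_filter` and the array-based interface are assumptions made for illustration.

```python
import numpy as np

def spectral_ratio_filter(chla, rrs_560, rrs_492, chla_threshold=1.0):
    """Return True for match-ups consistent with the Section IV-B ratio criterion.

    Keeps a sample when Rrs(560)/Rrs(492) < 1 for Chl-a < 1 mg/m3,
    or when Rrs(560)/Rrs(492) >= 1 for Chl-a >= 1 mg/m3.
    """
    chla = np.asarray(chla, dtype=float)
    ratio = np.asarray(rrs_560, dtype=float) / np.asarray(rrs_492, dtype=float)
    low_chla = chla < chla_threshold
    return np.where(low_chla, ratio < 1.0, ratio >= 1.0)
```

Samples for which the mask is False correspond to the erroneous spectra of Fig. 3(c) and (d) and would be dropped before training.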
Within the window, if a pixel is identified as invalid, then the mean of the remaining water-leaving Rrs that pass the quality flags is used as a replacement for that pixel. If more than one pixel fails to pass the quality flags, one of them is replaced by the mean of the valid pixels and the remaining invalid pixels are removed from the window to avoid data leakage during the training of the ML methods. After that, the filtering operation discussed in the last section is applied to remove the erroneous spectra. In the remaining text, the term valid pixels means the Rrs pixels that have passed the quality flags and the proposed filtering operation discussed in Sections IV-A and IV-B. The terms invalid and erroneous are used interchangeably. Matching the in situ Chl-a to all valid pixels in a window of 3×3 pixels increases the number of training and validation samples and improves the learning performance of the ML methods. During testing, estimating Chl-a over a window may yield different predicted values, depending on the variability of Rrs within the window. To obtain the final Chl-a value corresponding to the in situ measurement, the predictions are fused by taking their median. Our approach thus increases the number of match-ups and has improved the performance of all the compared algorithms.

Fig. 4. Performance evaluation of [Chl-a]Zpd retrievals using OCN, C2RCC-net, GPR, band ratios, and OC3 algorithms using the one-to-window approach without applying the filtering operation. The total number of test samples is 109. The overall and range-specific performances are included in Table V (Case iv).

D. Proposed Depth-Integrated Match-Up Creation

In the previous sections, the one-to-one and one-to-window match-up datasets were created using the surface Chl-a in situ concentrations.
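The one-to-window matching and the median fusion described in Section IV-C can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `mean_spectrum`, `one_to_window`, and `fuse_predictions` are hypothetical, spectra are plain lists of band values, and the quality-flag result is assumed to be available as one boolean per pixel. The spectral filtering step is assumed to be applied afterwards.

```python
from statistics import median


def mean_spectrum(spectra):
    """Band-wise mean of a list of Rrs spectra (hypothetical helper)."""
    n = len(spectra)
    return [sum(band) / n for band in zip(*spectra)]


def one_to_window(pixels, valid, chl_in_situ):
    """Pair one in situ Chl-a value with the usable pixels of a 3x3 window.

    pixels: list of Rrs spectra (one list of band values per pixel)
    valid:  list of booleans, True where the pixel passed the quality flags
    """
    good = [p for p, ok in zip(pixels, valid) if ok]
    if not good:
        return []  # nothing usable in this window
    samples = list(good)
    if len(good) < len(pixels):
        # One flagged pixel is replaced by the mean of the valid pixels;
        # any further flagged pixels are simply dropped.
        samples.append(mean_spectrum(good))
    return [(spectrum, chl_in_situ) for spectrum in samples]


def fuse_predictions(predictions):
    """Fuse the per-pixel Chl-a predictions of one window by their median."""
    return median(predictions)


if __name__ == "__main__":
    window = [[float(i), 2.0 * i] for i in range(9)]   # 9 pixels, 2 bands
    flags = [i not in (0, 8) for i in range(9)]        # two flagged pixels
    print(len(one_to_window(window, flags, 1.7)))      # 7 valid + 1 mean
    print(fuse_predictions([1.0, 3.0, 2.0]))
```

Dropping the additional flagged pixels, instead of filling each of them with the same mean spectrum, avoids placing duplicate samples in the training set, which is the leakage concern raised in the text.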
The Chl-a profiles indicate that, in most cases, the water samples collected at certain depths have higher concentrations of Chl-a than the surface, as illustrated in Fig. 5. Therefore, in this section, we extend both the one-to-one and one-to-window match-ups to one-to-one-depth-integrated and one-to-window-depth-integrated match-up selection techniques. That is, the depth-integrated-weighted-averaged Chl-a concentration is first matched to a single pixel and then to a window of 3×3 pixels, as described in the previous sections. These match-ups were made by computing depth-integrated-weighted-averaged Chl-a concentrations, which turned out to be more accurate than the surface Chl-a values in estimating phytoplankton biomass. To compute the depth-integrated-weighted-averaged Chl-a from the Chl-a concentrations measured at discrete depths z, we follow the approach developed in Uitz et al. [33], which is based on the work in [65]. Let [Chl-a]Zpd be the Chl-a concentration presumably seen by a satellite. It may be computed over the first optical depth Zpd, also known as the penetration depth, as follows:

$$
[\text{Chl-a}]_{Z_{pd}} = \frac{\int_{0}^{Z_{pd}} C(z)\,\exp(-2 k_d z)\,dz}{\int_{0}^{Z_{pd}} \exp(-2 k_d z)\,dz} \tag{1}
$$

where C(z) represents the Chl-a concentrations collected at discrete depths, and exp(−2 kd z) is an exponentially decreasing function that assigns a higher weight to the surface Chl-a and lower weights to the samples collected at increasing depths. The attenuation coefficient of the down-welling solar irradiance is given by kd = 4.6/Zeu, where Zeu is the euphotic depth, which may be computed for the open oceans as [65]

$$
Z_{eu} = 568.2\,[C_{tot}]^{-0.746} \tag{2}
$$

where $C_{tot} = \int_{0}^{z} C(z)\,dz$. We observed that the penetration depth Zpd varies from 2.5–17 m with a mean of 7±2.5 m in the bloom season (April–May), as shown in Fig. 5. In the remaining months, which are less productive (June–October), Zpd varies from 4–22 m with a mean of 9±3.14 m. As illustrated in Fig.
5(a), the maximum of Chl-a occurs in the upper column (1–12 m) in the bloom season and lies within the penetration depth. The Chl-a concentration decreases in the remaining months; however, the mean pigment profile shows almost the same trend, as depicted in Fig. 5(b). Due to the decreased concentrations of Chl-a, the mean penetration depth also increases by 2 m compared to the bloom season. To create depth-integrated Chl-a concentration match-ups, we first compute [Chl-a]Zpd using (1). In order to filter out the outliers and uncertainties in the remotely sensed data, we have proposed conditions based on the Chl-a spectral distributions in Section IV-B. Previously, we used the surface Chl-a ([Chl-a]surf) in these filters, while now we use the depth-integrated averaged Chl-a, denoted by [Chl-a]Zpd. Following the match-up selection and the filtering process, 78 matched pairs are finally selected for the one-to-one scheme and 514 match-ups for the one-to-window setting, which are then used to develop Chl-a concentration retrieval algorithms.

TABLE IV PERFORMANCE COMPARISON OF THE PROPOSED OCN ALGORITHM WITH EXISTING STATE-OF-THE-ART METHODS IN THREE DIFFERENT MATCH-UPS, EACH ESTIMATING SURFACE CHL-A [CHL-A]SURF AND DEPTH-INTEGRATED CHL-A [CHL-A]Zpd. The best results are shown in bold.

TABLE V PERFORMANCE COMPARISON BETWEEN THE OCN, GPR, OC3, AND SPECTRAL BAND RATIO METHODS IN RETRIEVING [CHL-A]Zpd IN ONE-TO-WINDOW CONFIGURATION USING FOUR DIFFERENT FILTERING CASES (I-IV) AS DEFINED IN SECTION VI-B. The best results are shown in bold.

Fig. 5. Chl-a profiles plotted as a function of geometrical depth for the years 2016–2018 in the Barents Sea: (a) April–May (bloom season) and (b) June–October.
The dotted lines represent some examples of Chl-a vertical distribution while the thick black lines represent the averaged Chl-a profiles over the complete dataset.

V. PROPOSED MACHINE LEARNING METHODOLOGY

ANNs have been proven to be efficient tools for studying nonlinear dynamic systems in various fields, including remote sensing, medicine, environmental studies, machine vision, and surveillance [66], [67]. ANNs have previously been used for Chl-a estimation [30]–[32]; however, to the best of our knowledge, no thorough study has been conducted to explore the efficiency of ANNs in the domain of OC monitoring in the Barents Sea and Norwegian coastal areas. This may be partially due to the unavailability of match-up datasets for the given area of observation and to uncertainties associated with the remotely sensed data. In the current work, the architecture of a fully connected feed-forward MLP is explored for OC monitoring (Table III). It is applied to the different types of match-up datasets discussed in the last section and compared with the existing state-of-the-art Chl-a retrieval techniques. In the following subsections, we explain the proposed OCN and its training process using the match-up datasets proposed in the last section.

A. Chlorophyll Estimation Using OCN

The proposed OCN model is trained using 10 input features: the eight bands centered at 443, 492, 560, 665, 704, 740, 782, and 865 nm, and the two ratios of the green band (560 nm) to the blue bands (443 and 492 nm), which are included because of their high sensitivity to changes in Chl-a concentrations. Each input Rrs feature is normalized between 0.00 and 1.00 before it is input to the OCN. The Chl-a content (in mg/m3) is converted into log-scale before being used as the target value, as proposed in previous studies [8], [37]. The log-transformed Chl-a follows a normal or near-normal distribution, which reduces the skewness in the data.
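The construction of the 10 inputs and the log-scale target described above can be sketched as follows. The text does not state how the features are scaled to the range 0.00 to 1.00, so per-feature min-max scaling fitted on the training rows is an assumption here, and all function names (`build_features`, `fit_min_max`, `apply_min_max`, `log_target`) are hypothetical.

```python
import math

# Centre wavelengths (nm) of the eight MSI bands used as inputs.
BANDS_NM = (443, 492, 560, 665, 704, 740, 782, 865)


def build_features(rrs):
    """Return the 10 raw inputs from a dict {wavelength_nm: Rrs}."""
    feats = [rrs[b] for b in BANDS_NM]
    feats.append(rrs[560] / rrs[443])  # green-to-blue ratio (443 nm)
    feats.append(rrs[560] / rrs[492])  # green-to-blue ratio (492 nm)
    return feats


def fit_min_max(rows):
    """Per-feature minima and maxima over the training rows (assumed scaling)."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(row, lo, hi):
    """Scale one feature row to [0, 1]; constant features map to 0."""
    return [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]


def log_target(chl_mg_m3):
    """Chl-a (mg/m3) converted to log10 scale for use as the target."""
    return math.log10(chl_mg_m3)


if __name__ == "__main__":
    sample = {443: 0.004, 492: 0.005, 560: 0.002, 665: 0.001,
              704: 0.0008, 740: 0.0005, 782: 0.0004, 865: 0.0002}
    print(len(build_features(sample)), log_target(10.0))
```

Fitting the minima and maxima on the training split only, and reusing them for validation and test rows, keeps information from the held-out samples out of the scaling step.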
There are a number of hyperparameters to tune in this network, including the number of hidden layers, loss function, activation function, learning rate, and regularization. These choices must be made carefully to obtain an accurate model output. In this study, different ANN designs with various weight initialization techniques, numbers of hidden layers, numbers of neurons in each hidden layer, activation functions, regularization techniques, and optimization algorithms with varying learning rate and batch size were implemented. The design with two hidden layers of 25 neurons each and the tanh activation function was found to be the best performer based on the validation loss. We experimentally observed that networks with a higher number of hidden layers and neurons are affected by overfitting. Batch normalization was applied after the activation function of each hidden layer for regularization [68]. The output of OCN is a single value of Chl-a, which is fed into the loss function. The optimization process minimizes the difference between the estimated and the in situ Chl-a concentrations using the backpropagation algorithm. The loss function is based on the root mean square log error (RMSLE) along with the ℓ2 norm on the weights W and the biases b

$$
L = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\log_{10}(\hat{y}_i) - \log_{10}(y_i)\right)^{2}} + \lambda_1\,\ell_2(W, b) \tag{3}
$$

where $\hat{y}_i$ is the predicted and $y_i$ is the corresponding ground-truth value, N is the total number of samples, and λ1 is a hyperparameter used to assign relative importance to the second term.

1) Optimization Process: The backpropagation algorithm uses a minibatch gradient descent method to compute the gradients (gt) of the cost function w.r.t. the weights w and biases b of the network. This algorithm aims to find the model weights and coefficients that minimize the loss over a minibatch during training. The training parameters are updated using n training examples (xn, yn) instead of a single example or the whole training dataset.
At each time step t, the cost function is minimized as follows:

$$
w_{t,n} = w_{t-1,n} - \eta\, g_{t,n} \tag{4}
$$

where $g_{t,n} = \nabla_w L$, and $\nabla_w$ is the gradient of the loss function L defined by (3), which is differentiable w.r.t. the weights. The parameter η is the learning rate, which represents the amount of change induced in the weights during each minibatch iteration. In the current work, the Adam optimizer is used for faster convergence of the model. The batch size is fixed to 64 samples in all experiments. The initial learning rate η0 was set to 0.0075 and decreases by 2% after every 100 epochs. These two hyperparameters are tuned based on the training and validation error during the training process. In our model, the weights and biases were initialized using the Xavier method [69], which improved both the convergence rate and the accuracy of the model.

2) Batch Normalization: Updates to the parameters being learned in the preceding layers cause a continuous change in the distribution of inputs to the later layers, which then need to readjust to the changed distribution, slowing down the convergence of the network. In order to avoid this internal covariate shift, batch normalization has been applied. This is achieved by controlling the mean and variance of the input distributions. This technique reduces the internal covariate shift between layers, and stabilizes and speeds up the learning process [68]. The Chl-a estimation performance improved by >5% after the implementation of batch normalization. For an n-dimensional input batch x = (x1, ..., xn), batch normalization is performed as follows:

$$
\hat{x}_i = \frac{x_i - E[x_i]}{\sqrt{\mathrm{var}[x_i]}} \tag{5}
$$

where $x_i$ is a particular input to the layer, $\hat{x}_i$ represents the normalized input, $E[x_i]$ is the batch mean, and $\mathrm{var}[x_i]$ is the variance of the batch. The output of the layer is then scaled and shifted

$$
y_i = \alpha\,\hat{x}_i + \beta \tag{6}
$$

where α and β are scaling and shifting parameters that are learned during training.

B.
Experimental Setup

To evaluate the proposed OCN and the other ML methods, the match-ups are randomly split into 90% training and 10% testing samples. Experiments are repeated with tenfold cross-validation. The training data in each split are further divided into training and validation splits: 90% and 10% for the one-to-one configuration, and 70% and 30% for the one-to-window configuration because of its higher number of match-ups. Using the training data only, the proposed OCN model is trained for 5000 epochs. In order to properly tune the hyperparameters and avoid overfitting, the OCN model with the weights and bias terms that gave the minimum validation loss during the training iterations is used to estimate Chl-a on unseen test data. The OCN model is developed in TensorFlow. The GPR is implemented in Python using the Scikit-learn Machine Learning Toolkit [70] and is trained using the same training splits. The radial basis function (RBF) kernel is used with GPR since it performs better than the linear kernel. The kernel hyperparameters are optimized on the validation split by maximizing the log-marginal-likelihood (LML) using the limited memory Broyden–Fletcher–Goldfarb–Shanno algorithm. As the LML may have multiple local optima, the optimizer is randomly initialized 10 times, and the best performer is selected. The noise level in the targets (alpha), which is a value added to the diagonal of the kernel matrix during fitting, is also fine-tuned. The RMSLE is computed N times during each cross-validation step and, based on it, alpha is selected for the test split. A significant improvement in the GPR model is observed after fine-tuning alpha compared to the default values. In this work, two versions of the OC3 algorithm are compared: a globally trained version and a locally trained version, OC3LT. The OC3LT is trained by combining the training and validation splits, as explained in Appendix A.

C.
Performance Indicators

To compare the performance of different methods, a number of linear and log-transformed metrics are used. These metrics include the RMSE, RMSLE, MSE, MSLE, mean absolute log error (MAE) computed in log-space, bias, and coefficient of determination, R2. The metrics computed in log-space provide a better assessment of the algorithmic performance, as the log-transformed data follow a normal or near-normal distribution. In addition to the above metrics, we have also included linear regression slopes to facilitate comparison between different methods

$$
\mathrm{RMSE} = \sqrt{\frac{1}{N_t}\sum_{i=1}^{N_t}\left(\hat{y}_i - y_i\right)^{2}} \tag{7}
$$

$$
\mathrm{RMSLE} = \sqrt{\frac{1}{N_t}\sum_{i=1}^{N_t}\left(\log_{10}(\hat{y}_i) - \log_{10}(y_i)\right)^{2}} \tag{8}
$$

$$
\mathrm{MSE} = \frac{1}{N_t}\sum_{i=1}^{N_t}\left(\hat{y}_i - y_i\right)^{2} \tag{9}
$$

$$
\mathrm{MSLE} = \frac{1}{N_t}\sum_{i=1}^{N_t}\left(\log_{10}(\hat{y}_i) - \log_{10}(y_i)\right)^{2} \tag{10}
$$

$$
\mathrm{Bias} = 10^{\frac{1}{N_t}\sum_{i=1}^{N_t}\left(\log_{10}(\hat{y}_i) - \log_{10}(y_i)\right)} \tag{11}
$$

$$
\mathrm{MAE} = 10^{\frac{1}{N_t}\sum_{i=1}^{N_t}\left|\log_{10}(\hat{y}_i) - \log_{10}(y_i)\right|} \tag{12}
$$

$$
R^{2} = 1 - \frac{\sum_{i=1}^{N_t}\left(\log_{10}(\hat{y}_i) - \log_{10}(y_i)\right)^{2}}{\sum_{i=1}^{N_t}\left(\log_{10}(y_i) - \log_{10}(\bar{y})\right)^{2}} \tag{13}
$$

where $\hat{y}_i$ is the predicted and $y_i$ is the corresponding ground-truth Chl-a concentration, $N_t$ is the number of test samples, and $\bar{y} = \frac{1}{N_t}\sum_{i=1}^{N_t} y_i$ is the mean Chl-a value in the test dataset. A bias of 1.5 implies that Chl-a estimations are, on average, 50% larger than the actual measurements [71].

VI. RESULTS AND DISCUSSION

The performance statistics on Chl-a estimation are computed for three different configurations, each including the estimation of surface chlorophyll, [Chl-a]surf, and of depth-integrated chlorophyll, [Chl-a]Zpd. These three configurations are one-to-one match-ups, one-to-window match-ups, and one-to-median match-ups. The median Rrs value for each band is computed by taking the median over all the valid pixels in a 3 × 3 window [8], [34], [38].

A. Performance Evaluation

In most of these experiments, the proposed OCN has consistently shown the best performance over all indicators compared to the band ratio, the modified OC3 [45], OC3LT, and the other ML methods, as illustrated in Table IV.
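The log-space indicators in (8) and (10)–(13), on which Table IV relies, can be sketched as follows. This is a minimal illustration with a hypothetical function name `log_metrics`; it assumes that the difference in (11) is taken as prediction minus measurement, which is the order consistent with the stated reading that a bias of 1.5 means a 50% overestimation.

```python
import math


def log_metrics(pred, true):
    """Log-space indicators for predicted and measured Chl-a (both > 0)."""
    n = len(true)
    # Per-sample log10 differences, prediction minus measurement.
    d = [math.log10(p) - math.log10(t) for p, t in zip(pred, true)]
    sq = sum(e * e for e in d)
    msle = sq / n
    # Denominator of (13): spread of the measurements around the log10
    # of their linear mean.
    log_mean = math.log10(sum(true) / n)
    ss_tot = sum((math.log10(t) - log_mean) ** 2 for t in true)
    return {
        "MSLE": msle,
        "RMSLE": math.sqrt(msle),
        "Bias": 10 ** (sum(d) / n),
        "MAE": 10 ** (sum(abs(e) for e in d) / n),
        "R2": 1.0 - sq / ss_tot,
    }


if __name__ == "__main__":
    measured = [0.5, 1.0, 4.0]
    # A retrieval that is uniformly 50% too high has Bias and MAE near 1.5.
    print(log_metrics([1.5 * v for v in measured], measured))
```

Because Bias and MAE are exponentiated back from log-space, both are multiplicative: a value of 1 indicates no systematic error, and an MAE of 1.28 corresponds to a typical relative error of 28%.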
For the estimation of [Chl-a]surf in the one-to-one configuration, OC3LT has achieved the minimum MSE and RMSE (Table IV). However, the remaining performance indicators, which are in log scale, indicate that OCN performs better than the GPR, OC3LT, and band ratio methods. Also, in estimating [Chl-a]Zpd, MSE and RMSE show that the OC3LT algorithm is the second best performer; however, the remaining indicators do not show favorable results for OC3LT. In the one-to-window configuration, the locally trained ML methods, OCN and GPR, are the top performers in estimating both [Chl-a]surf and [Chl-a]Zpd, due to the increased number of match-ups. The scatter-plots in Figs. 6 and B.1–B.5 (Appendix B) further indicate that the globally trained OC3 and C2RCC-net lead to significant overestimation. It should be noted that, in these methods, the Chl-a estimation exceeds 25 mg/m3 while the in situ Chl-a does not exceed 10.81 mg/m3. In contrast, the band ratio algorithms show underestimation. The ML-based models, OCN and GPR, and the locally trained OC3LT are the leading performers in all the configurations; however, OCN outperforms GPR and OC3LT by a significant margin. Furthermore, the slope between the in situ Chl-a and the Chl-a predicted by OCN in log-scale indicates that the relationship is close to unity (>90) compared with the other empirical and ML-based methods. In our experiments, the proposed OCN has achieved the best fit across the entire range of Chl-a concentration. The other performance indicators listed in Table IV show the same trend. It should be noted that the performance of most of the compared methods has improved with the proposed depth integration, compared to the surface Chl-a estimations. For the case of one-to-one match-ups using OCN, the R2 value increased from 0.579 to 0.65, while MSE decreased from 2.36 to 1.42. For GPR, the R2 value increased from 0.50 to 0.56, while MSE decreased from 2.296 to 2.115.
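The depth-weighted concentration [Chl-a]Zpd of (1), on which these depth-integrated results rest, can be sketched for a discrete profile as follows. This is a simplified illustration, not the authors' code. It assumes trapezoidal integration between sampling depths, evaluates Ctot in (2) over the whole sampled profile, takes the penetration depth as the first optical depth Zpd = 1/kd = Zeu/4.6, and interpolates the profile linearly at Zpd. The function names are hypothetical.

```python
import math


def trapz(y, x):
    """Trapezoidal integral of samples y over increasing positions x."""
    return sum((x[i + 1] - x[i]) * (y[i + 1] + y[i]) / 2.0
               for i in range(len(x) - 1))


def euphotic_depth(depths, chl):
    """Zeu from (2); Ctot integrated over the sampled profile (assumption)."""
    c_tot = trapz(chl, depths)
    return 568.2 * c_tot ** -0.746


def chl_zpd(depths, chl):
    """Return ([Chl-a]Zpd, Zpd) for a profile sorted by increasing depth (m)."""
    k_d = 4.6 / euphotic_depth(depths, chl)
    z_pd = 1.0 / k_d  # assumed: first optical depth
    zs, cs = [], []
    for z, c in zip(depths, chl):
        if z <= z_pd:
            zs.append(z)
            cs.append(c)
            continue
        if zs:
            # Close the integration interval at Zpd by linear interpolation.
            z0, c0 = zs[-1], cs[-1]
            cs.append(c0 + (c - c0) * (z_pd - z0) / (z - z0))
            zs.append(z_pd)
        break
    if len(zs) < 2:
        raise ValueError("need at least two samples spanning [0, Zpd]")
    w = [math.exp(-2.0 * k_d * z) for z in zs]  # weights from (1)
    num = trapz([c * wi for c, wi in zip(cs, w)], zs)
    den = trapz(w, zs)
    return num / den, z_pd


if __name__ == "__main__":
    # Profile with a subsurface maximum (values are illustrative only).
    print(chl_zpd([0.0, 5.0, 10.0, 20.0], [1.0, 3.0, 5.0, 2.0]))
```

For a vertically uniform profile the weighted average reduces to the surface value, which gives a quick sanity check of the weighting.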
A similar trend can be observed in most of the compared methods, which demonstrates the significance of using the depth integration approach. Also, we observed that OCN’s performance improvement is more significant than that of the other compared methods because of its capability to learn the nonlinear mapping of Rrs into [Chl-a]Zpd.

Fig. 6. Performance evaluation of [Chl-a]Zpd retrievals by the one-to-window approach using the OCN, C2RCC-net, GPR, band ratios, and OC3 algorithms. The total number of test samples is 78.

Significant enhancement can also be observed in most of the compared methods when the proposed one-to-window match-up configuration is used. A comparison of the one-to-one match-ups with the one-to-window match-ups shows a decrease in the MSLE of OCN from 0.070 to 0.025 for [Chl-a]surf. In the case of OC3LT, MSLE decreased from 0.078 to 0.065. A similar trend can be observed in most of the other compared methods because the window approach leverages more data than the one-to-one configuration. The one-to-window approach also compensates for the location estimation errors between the in situ measurements and the satellite data. It may also handle, to some extent, the Chl-a transport due to the time gap between the in situ measurements and the satellite data. The combination of the depth integration approach with the one-to-window configuration yields the benefits of both approaches. In this case, all the compared methods have achieved their best performance compared to the previous experiments, as reported in Table IV. In this configuration, OCN achieves R2 = 0.88, MAE < 28%, and MSLE = 0.018, which is not only better than its performance in the previous configurations but also better than all of the compared methods. The nearest competitor, GPR, has obtained R2 and MSLE of 0.82 and 0.026.
These results demonstrate that not only have the depth integration and the window-based estimation individually improved Chl-a estimation, but their combination also yields a more significant performance boost for all the compared methods. Thus, one may conclude that the proposed improvements are generic and would help enhance Chl-a estimation methods. We have also included an additional configuration in our experiments: one-to-median match-ups, which have been previously used in [8], [34], and [38]. We observe that the performance in this configuration is similar to that of the one-to-one configuration. Compared to the one-to-window configuration, the one-to-median performance is lower for both [Chl-a]surf and [Chl-a]Zpd. These experiments demonstrate that our proposed window approach is better than the previously used match-up approaches due to the higher number of training and validation samples. Even if adequate training data are available, the proposed one-to-window approach is still expected to perform better than the one-to-one configuration in open ocean waters; however, it needs to be analyzed on different water types. In the current study, we have observed that the erroneous Rrs spectra within a window of 3 × 3 pixels are due to higher time-gaps between the in situ and satellite data and to ambiguities in the Rrs product in the blue and green bands caused by uncertainties in the AC [37]. However, in highly dynamic inland and coastal waters, where large temporal and spatial variability in Chl-a concentrations may exist [45], the window approach is recommended with a modified filtering criterion, for example [36], so that realistic Rrs spectra are not filtered out.

B. Analyzing the Filtering Criterion

To further explore the filtering criterion discussed in Section IV-B, we have changed the ratio threshold and computed the performance indicators to compare the methods.
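The criterion of Section IV-B, with its threshold exposed as a parameter so that it can be varied, can be sketched as follows. This is a minimal illustration with a hypothetical function name. It assumes that a single value is used both as the Chl-a break point (in mg/m3) and as the bound on the Rrs(560)/Rrs(492) ratio, so variants that use different values for the two conditions are not covered.

```python
def passes_spectral_filter(chl, rrs_560, rrs_492, threshold=1.0):
    """Return True if a match-up is consistent with the ratio criterion.

    chl:       in situ Chl-a (surface or depth-integrated), mg/m3
    rrs_560:   Rrs at 560 nm (green band)
    rrs_492:   Rrs at 492 nm (blue band)
    threshold: shared break point for Chl-a and for the band ratio
    """
    ratio = rrs_560 / rrs_492
    if chl < threshold:
        # Low Chl-a: the spectrum is expected to peak in the blue.
        return ratio < threshold
    # High Chl-a: the spectrum is expected to peak in the green.
    return ratio >= threshold


if __name__ == "__main__":
    # (chl, Rrs560, Rrs492) triples with illustrative values only.
    matchups = [(0.4, 0.002, 0.004), (0.6, 0.005, 0.004), (3.0, 0.006, 0.004)]
    kept = [m for m in matchups if passes_spectral_filter(*m)]
    print(len(kept))
```

Passing `threshold=1.25` or `threshold=1.5` reproduces variants in which both conditions are moved together.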
We experimentally observe that, in many cases, when the Chl-a content is <1 mg/m3, the Rrs spectrum peaks at the blue wavelength, and the peak tends to shift toward the green region of the spectrum for Chl-a concentrations ≥1 mg/m3. However, in some cases, the peak of the Rrs spectra may deviate from this observation when Chl-a ranges from 1–1.5 mg/m3. Therefore, in Table V, we compare the different methods while varying the threshold in Section IV-B. The four different cases are shown below:

$$
\text{Case i:}\quad
\begin{cases}
\dfrac{R_{rs}(\lambda_{560\,\mathrm{nm}})}{R_{rs}(\lambda_{492\,\mathrm{nm}})} < 1.25, & \text{if Chl-a} < 1.25\ \mathrm{mg/m^3} \\[2ex]
\dfrac{R_{rs}(\lambda_{560\,\mathrm{nm}})}{R_{rs}(\lambda_{492\,\mathrm{nm}})} \geq 1.25, & \text{if Chl-a} \geq 1.25\ \mathrm{mg/m^3}
\end{cases}
$$

$$
\text{Case ii:}\quad
\begin{cases}
\dfrac{R_{rs}(\lambda_{560\,\mathrm{nm}})}{R_{rs}(\lambda_{492\,\mathrm{nm}})} < 1.5, & \text{if Chl-a} < 1.5\ \mathrm{mg/m^3} \\[2ex]
\dfrac{R_{rs}(\lambda_{560\,\mathrm{nm}})}{R_{rs}(\lambda_{492\,\mathrm{nm}})} \geq 1.5, & \text{if Chl-a} \geq 1.5\ \mathrm{mg/m^3}
\end{cases}
$$

$$
\text{Case iii:}\quad
\begin{cases}
\dfrac{R_{rs}(\lambda_{560\,\mathrm{nm}})}{R_{rs}(\lambda_{492\,\mathrm{nm}})} < 1.5, & \text{if Chl-a} < 1.5\ \mathrm{mg/m^3} \\[2ex]
\dfrac{R_{rs}(\lambda_{560\,\mathrm{nm}})}{R_{rs}(\lambda_{492\,\mathrm{nm}})} \geq 1, & \text{if Chl-a} \geq 1\ \mathrm{mg/m^3}
\end{cases}
$$

Case iv: no filtering.

Fig. 7. MSI-derived Chl-a products estimated using the OCN, C2RCC-net, OC3, Ratio-1, and Ratio-2 algorithms for near-coincident overpasses of Sentinel-2A on May 5th, 2017. The marked location (circle) represents the in situ measurement of Chl-a, reported as 4.27 mg/m3.

As evidenced by Table V, the performance of all the methods, including OCN, degrades after changing the filtering threshold; however, OCN degrades more gracefully than the other methods and maintains its top position. It may be noted that in Case iv, without filtering, all methods show their largest degradation; for example, R2 reduces from 0.88 to 0.51 for OCN. In Case i–Case iii, the size of the match-up dataset increases as the threshold is varied; however, a gradual decrease is seen in the performance of all compared methods. For OCN, the MSLE and RMSLE increased from 0.018 and 0.134 (Table IV) to 0.023 and 0.150 in Case i (Table V).
Most performance indicators show almost the same results in Case i and Case ii. However, an increase of 18% in the RMSLE and 38% in the MSLE is seen in Case iii, which indicates degraded performance in this experiment. These experiments confirm the effectiveness of the proposed threshold of 1.00 in the filtering criterion in Section IV-B.

C. Spatial Maps

To confirm the reliability of the OCN model, the proposed approach is demonstrated for producing Chl-a maps in the Barents Sea. The Sentinel-2A TOA Rrs images were compensated for atmospheric effects using C2RCC-net. For demonstration purposes, the Chl-a maps produced by OCN are compared visually with the maps retrieved via C2RCC-net, the band ratio methods, and OC3.

Fig. 8. MSI-derived Chl-a products estimated using the OCN, C2RCC-net, OC3, Ratio-1, and Ratio-2 algorithms for near-coincident overpasses of Sentinel-2B on April 5th, 2018. The marked locations (circle and triangle) represent in situ measurements reported as 4.9 and 6.14 mg/m3. The pixels with no data and those flagged as cloudy are represented by white color. The TOA MSI image was processed to Rrs using C2RCC-net.

Fig. 7 illustrates MSI-derived Chl-a products in the bloom season on May 5th, 2017, generated from the cloud-free Sentinel-2A observation nearest in time to the in situ measurement. All the algorithms capture the spatial variability of Chl-a; however, they provide different Chl-a retrievals. For example, OCN produces Chl-a products ranging from 0.3 to 7 mg/m3, whereas C2RCC-net and OC3 overestimate Chl-a, and the band-ratio estimates do not exceed 3 mg/m3. The in situ measurement at the marked location reported Chl-a = 4.27 mg/m3. Amongst the mentioned algorithms, the OCN estimate (3.48 mg/m3) is closest to the in situ concentration, followed by OC3, for which the estimated Chl-a = 3.02 mg/m3.
The band ratio-1 and ratio-2 algorithms underestimate the concentration, retrieving 1.62 and 1.15 mg/m3, respectively. The C2RCC-net also underestimates, by roughly a factor of four, and reports 1.02 mg/m3. In addition, we examine the performance of the proposed OCN on another observation, acquired by Sentinel-2B on April 5th, 2018, in the bloom season, as shown in Fig. 8. From the OCN map, it can be inferred that the proposed model has accurately captured the fine details and abrupt changes in the Chl-a distribution. It can be seen that the OCN model produces Chl-a products ranging from 1 to 14 mg/m3. The Chl-a content estimated by C2RCC-net and OC3 exceeds 30 mg/m3, which is significantly above the in situ observations, indicating overestimation of Chl-a concentrations. The two band-ratio algorithms underestimate the Chl-a concentrations, with a maximum estimated Chl-a of <5 mg/m3. The Chl-a product produced by the OCN model lies within the Chl-a ≤ 14 mg/m3 range and shows a better correlation with the in situ Chl-a concentration. For example, the in situ observations of Chl-a reported as 4.9 and 6.14 mg/m3 at the marked locations are closely estimated by OCN, i.e., 4.74 and 4.89 mg/m3, and by OC3, i.e., 4.72 and 7.57 mg/m3, respectively. The OCN and OC3 estimates are quite close to each other; however, the OCN predictions are slightly better. In contrast, these concentrations are underestimated by C2RCC-net and the band-ratio algorithms. The C2RCC-net predicts 1.41 and 5.64 mg/m3, and the estimates of the two band-ratio algorithms are quite close to each other: ratio-1 estimates 1.64 and 1.97 mg/m3, and ratio-2 estimates 1.31 and 1.53 mg/m3. These experiments demonstrate that the OCN model has generated reliable Chl-a products.

D. Limitations of the Proposed Approach

The performance of an ML-based model depends on the representativeness of the training dataset. The proposed OCN model is regionally tuned for the Barents Sea.
Compared to other Chl-a datasets collected in lakes, inland, and coastal waters [8], [72] covering different water types, the current dataset is limited to Chl-a measurements from the Barents Sea and some regions of the Norwegian Sea. Like other ML algorithms, the accuracy of OCN depends on the distribution of, and the uncertainties in, the field data. In addition, considering the revisit time of Sentinel-2 MSI and the cloud coverage in the high north, our current match-up dataset does not contain adequate training samples from the coastal areas of the Svalbard region. However, the training dataset may be extended by using the Landsat-8 and Sentinel-2 MSI virtual constellation product, which can achieve improved coverage with reduced revisit time [73].

The proposed one-to-window match-up approach has significantly improved the estimation of OCN; however, if the variation of Rrs within a window is large, it may adversely affect the learning and the estimation process. To handle this issue, we have restricted the window size to 3 × 3 pixels in our experiments. The proposed match-up criterion is based on the C2RCC-derived Rrs. The performance of the proposed algorithm in estimating Chl-a is affected by the uncertainties in the AC process [8]. Although we have shown experimentally that the proposed filtering and window approaches improve the performance of OCN and the compared algorithms in open ocean waters, the uncertainties shown in Table IV indicate the need for further improvement in OCN estimation performance. This may be achieved by extending the dataset and simultaneously estimating other in-water parameters such as total suspended matter (TSM) and colored dissolved organic matter (CDOM). Learning the simultaneous mapping of Rrs to these quantities will improve Chl-a estimation and will further strengthen the proposed filtering approach.

VII.
CONCLUSION AND FUTURE WORK

This work aims at improving the estimation of phytoplankton biomass using optical remote sensing integrated with ML techniques over the recently changing Barents Sea. In situ Chl-a measurements were collected from 2016 to 2018 over a wide area of the Barents Sea and the Norwegian coast. Different match-up dataset creation methods are proposed that exploit the pigment content information at the surface as well as within the productive column. Surface and depth-integrated Chl-a concentrations are matched with the nearest pixel/window in the satellite image. A filtering criterion based on the Rrs spectral distribution is also proposed that allows a larger time-gap between in situ and satellite observations and removes outliers. An NN dubbed OCN is applied to the inverse problem of estimating Chl-a from Rrs extracted from C2RCC-net for Sentinel-2 (MSI) observations. Using the coincident in situ and Rrs observations, the proposed OCN model is trained, validated, and compared against state-of-the-art approaches, including the locally trained GPR and OC3LT, the globally trained C2RCC, and the empirical methods OC3 and spectral band ratios. Our experiments demonstrate that the proposed OCN is a promising Chl-a retrieval method that has performed favorably compared to the existing state-of-the-art methods. The blue and green bands are found to be more sensitive than the red and NIR bands for predicting Chl-a in the Barents Sea. The proposed match-up dataset creation algorithm is generic, and it has significantly improved the performance of the OCN and the other compared techniques. The R-score and R2 between the in situ measurements and the Chl-a estimated using the proposed OCN are the highest, while the MSE and RMSE are the lowest among the compared methods. Moreover, the proposed OCN model exhibits the best performance in the different match-up configurations.
The obtained results demonstrate the potential of the proposed approach in producing reliable Chl-a products. As evidenced through the spatial maps, the proposed OCN produces more realistic Chl-a map products by accurately capturing the fine details and abrupt changes in the Chl-a distribution. Future directions include the validation and expansion of OCN on Rrs products from various AC algorithms and different satellites, as well as the collection of in situ Chl-a data, including in situ Rrs, from the northern Barents Sea in the marginal ice zone. Moreover, the in situ Chl-a dataset will be extended through collaboration with IMR, Norway. The OCN implementation will also be extended to simultaneously estimate other in-water parameters of interest, such as TSM and CDOM.

CONTRIBUTION

Muhammad Asim: Conceptualization, methodology, experimental work, software, validation of results, and writing of original draft. Camilla Brekke: Guidance, review of main methodology, designing, reviewing, and editing of the manuscript. Arif Mahmood: Experimental setup, writing, and editing of the manuscript. Torbjørn Eltoft: Review of approach, design, and editing of the article. Marit Reigstad: Provided ideas/guidance, review, and editing.

APPENDIX A

The OC3LT Chl-a retrieval algorithm [45] is given by (A.1). The value of x is configured with respect to the MSI sensor

$$
\begin{aligned}
x &= \log_{10}\!\left[\max\!\left(R_{rs}(443),\, R_{rs}(493)\right)\cdot R_{rs}(560)^{-1}\right] \\
y &= a_0 + a_1 x + a_2 x^{2} + a_3 x^{3} + a_4 x^{4} \\
\mathrm{OC3LT} &= 10^{y}.
\end{aligned}
\tag{A.1}
$$

The values of the coefficients of the polynomial expression are computed by least-squares minimization for each split (k-fold) using the training data only

$$
Y = Xa, \qquad a = (X^{T}X)^{-1}X^{T}Y.
$$

The globally trained OC3 Chl-a retrieval algorithm [45] is given by (A.1). The coefficients are adopted from the previous study [8]

$$
\begin{aligned}
y &= 0.3308 - 2.6684x + 1.599x^{2} + 0.5525x^{3} - 1.4876x^{4} \\
\mathrm{OC3} &= 10^{y}.
\end{aligned}
$$

APPENDIX B

This section contains results using the different settings discussed in Section IV.
Fig. B.1. Performance evaluation of surface Chl-a retrievals using OCN, C2RCC-net, GPR, band ratios, and OC3 algorithms for the one-to-one (central pixel) configuration. The total number of test samples is 52. The overall and range-specific performances are included in Table IV.
Fig. B.2. Performance evaluation of [Chl-a]Zpd retrievals using OCN, C2RCC-net, GPR, band ratios, and OC3 algorithms for the one-to-one (central pixel) configuration. The total number of test samples is 53.
Fig. B.3. Performance evaluation of surface Chl-a retrievals using OCN, C2RCC-net, GPR, band ratios, and OC3 algorithms for the one-to-one (median of 3×3 pixels) configuration. The total number of test samples is 59.
Fig. B.4. Performance evaluation of [Chl-a]Zpd retrievals using OCN, C2RCC-net, GPR, band ratios, and OC3 algorithms for the one-to-one (median of 3×3 pixels) configuration. The number of test samples is 62.
Fig. B.5. Performance evaluation of surface Chl-a retrievals using OCN, C2RCC-net, GPR, band ratios, and OC3 algorithms for the one-to-window approach. The number of test samples is 75.

ACKNOWLEDGMENT
The authors would like to thank the Institute of Marine Research, Norway, for providing the in situ Chl-a data.

REFERENCES
[1] V. Le Fouest, C. Postlethwaite, M. A. M. Maqueda, S. Belanger, and M. Babin, “On the role of tides and strong wind events in promoting summer primary production in the Barents sea,” Continental Shelf Res., vol. 31, no. 17, pp. 1869–1879, 2011. [2] H. Loeng, “Features of the physical oceanographic conditions of the Barents sea,” Polar Res., vol. 10, no. 1, pp. 5–18, 1991. [3] E. Sakshaug, “Primary and secondary production in the Arctic seas,” in Proc.
Organic Carbon Cycle Arctic Ocean., 2004, pp. 57–81. [4] M. Årthun, T. Eldevik, L. Smedsrud, Ø. Skagseth, and R. Ingvaldsen, “Quantifying the influence of Atlantic heat on Barents sea ice variability and retreat,” J. Climate, vol. 25, no. 13, pp. 4736–4743, 2012. [5] T. Haug et al., “Future harvest of living resources in the Arctic ocean north of the Nordic and Barents seas: A review of possibilities and constraints,” Fisheries Res., vol. 188, pp. 38–57, 2017. [6] P. Dalpadado et al., “Climate effects on temporal and spatial dynamics of phytoplankton and zooplankton in the Barents sea,” Prog. Oceanogr., vol. 185, 2020, Art. no. 102320. [7] A. Morel, H. Claustre, D. Antoine, and B. Gentili, “Natural variability of bio-optical properties in case 1 waters: Attenuation and reflectance within the visible and near-UV spectral domains, as observed in South Pacific and Mediterranean waters,” Biogeosciences, vol. 4, no. 5, pp. 913–925, 2007. [8] N. Pahlevan et al., “Seamless retrievals of chlorophyll-a from Sentinel-2 (MSI) and Sentinel-3 (OLCI) in inland and coastal waters: A machine-learning approach,” Remote Sens. Environ., vol. 240, 2020, Art. no. 111604. [9] M. Asim, C. Brekke, A. Mahmood, T. Eltoft, and M. Reigstad, “Ocean color net (OCN) for the Barents sea,” in Proc. IGARSS 2020-2020 IEEE Int. Geosci. Remote Sens. Symp., 2020, pp. 5881–5884. [10] C. J. Gobler, “Climate change and harmful algal blooms: Insights and perspective,” Harmful Algae, vol. 91, 2020, Art. no. 101731. [11] E. N. Hegseth, “Phytoplankton of the Barents sea-the end of a growth season,” Polar Biol., vol. 17, no. 3, pp. 235–241, 1997. [12] H. Hodal and S. Kristiansen, “The importance of small-celled phytoplankton in spring blooms at the marginal ice zone in the northern Barents sea,” Deep Sea Res. Part II: Topical Stud. Oceanogr., vol. 55, no. 20-21, pp. 2176–2185, 2008. [13] J.
Holt et al., “Potential impacts of climate change on the primary production of regional seas: A comparative analysis of five European seas,” Prog. Oceanogr., vol. 140, pp. 91–115, 2016. [14] M. Reigstad, P. Wassmann, C. W. Riser, S. Øygarden, and F. Rey, “Variations in hydrography, nutrients and chlorophyll a in the marginal ice-zone and the central Barents sea,” J. Mar. Syst., vol. 38, no. 1-2, pp. 9–29, 2002. [15] P. Wassmann et al., “Food webs and carbon flux in the Barents sea,” Prog. Oceanogr., vol. 71, no. 2-4, pp. 232–287, 2006. [16] O. V. Kopelevich, V. I. Burenkov, and S. V. Sheberstov, “Case studies of optical remote sensing in the Barents sea, Black sea and Caspian sea,” in Proc. Remote Sens. Eur. Seas, 2008, pp. 53–66. [17] O. Engelsen, E. N. Hegseth, H. Hop, E. Hansen, and S. Falk-Petersen, “Spatial variability of chlorophyll-a in the marginal ice zone of the barents sea, with relations to sea ice and oceanographic conditions,” J. Mar. Syst., vol. 35, no. 1/2, pp. 79–97, 2002. [18] J. Kogeler and F. Rey, “Ocean colour and the spatial and seasonal distribution of phytoplankton in the Barents sea,” Int. J. Remote Sens., vol. 20, no. 7, pp. 1303–1318, 1999. [19] K. R. Arrigo, P. A. Matrai, and G. L. Van Dijken, “Primary productivity in the Arctic ocean: Impacts of complex optical properties and subsurface chlorophyll maxima on large-scale estimates,” J. Geophysical Res.: Oceans, vol. 116, no. C11, 2011. [20] V. J. Hill et al., “Synthesis of integrated primary production in the arctic ocean: II. In situ and remotely sensed estimates,” Prog. Oceanogr., vol. 110, pp. 107–125, 2013. [21] M. Ardyna, M. Babin, M. Gosselin, E. Devred, L. Rainville, and J.-É. Tremblay, “Recent Arctic ocean sea ice loss triggers novel fall phytoplankton blooms,” Geophysical Res. Lett., vol. 41, no. 17, pp. 6207–6212, 2014. [22] K. R. Arrigo and G. L. van Dijken, “Continued increases in arctic ocean primary production,” Prog. Oceanogr., vol. 136, pp. 60–70, 2015. [23] I. 
Kostakis et al., “Development of a bio-optical model for the barents sea to quantitatively link glider and satellite observations,” Philos. Trans. Roy. Soc. A, vol. 378, no. 2181, 2020, Art. no. 20190367. [24] L. E. Keiner and X.-H. Yan, “A neural network model for estimating sea surface chlorophyll and sediments from thematic mapper imagery,” Remote Sens. Environ., vol. 66, no. 2, pp. 153–165, 1998. [25] G. Camps-Valls, L. Bruzzone, J. L. Rojo-Alvarez, and F. Melgani, “Robust support vector regression for biophysical variable estimation from remotely sensed images,” IEEE Geosci. Remote Sens. Lett., vol. 3, no. 3, pp. 339–343, Jul. 2006. [26] G. Camps-Valls, L. Gómez-Chova, J. Muñoz-Marí, J. Vila-Francés, J. Amorós-López, and J. Calpe-Maravilla, “Retrieval of oceanic chlorophyll concentration with relevance vector machines,” Remote Sens. Environ., vol. 105, no. 1, pp. 23–33, 2006. [27] P. M. Maier, S. Hinz, and S. Keller, “Estimation of chlorophyll a, diatoms and green algae based on hyperspectral data with machine learning approaches,” Tagungsband der, vol. 27, pp. 49–57, 2018. [28] L. Pasolli, F. Melgani, and E. Blanzieri, “Gaussian process regression for estimating chlorophyll concentration in subsurface waters from remote sensing data,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 3, pp. 464–468, Jul. 2010. [29] K. Blix, J. Li, P. Massicotte, and A. Matsuoka, “Developing a new machine-learning algorithm for estimating chlorophyll-a concentration in optically complex waters: A case study for high northern latitude waters by using Sentinel 3 OLCI,” Remote Sens., vol. 11, no. 18, 2019, Art. no. 2076. [30] S. Hafeez et al., “Comparison of machine learning algorithms for retrieval of water quality indicators in case-II waters: A case study of hong kong,” Remote Sens., vol. 11, no. 6, pp. 617–640, 2019. [31] J.
Verrelst et al., “Machine learning regression algorithms for biophysical parameter retrieval: Opportunities for Sentinel-2 and-3,” Remote Sens. Environ., vol. 118, pp. 127–139, 2012. [32] J. He, Y. Chen, J. Wu, D. A. Stow, and G. Christakos, “Space-time chlorophyll-a retrieval in optically complex waters that accounts for remote sensing and modeling uncertainties and improves remote estimation accuracy,” Water Res., vol. 171, 2020, Art. no. 115403. [33] J. Uitz, H. Claustre, A. Morel, and S. B. Hooker, “Vertical distribution of phytoplankton communities in open ocean: An assessment based on surface chlorophyll,” J. Geophysical Res., Oceans, vol. 111, no. C8, 2006. [34] L. G. Vilas, E. Spyrakos, and J. M. T. Palenzuela, “Neural network estimation of chlorophyll a from MERIS full resolution data for the coastal waters of Galician rias (NW Spain),” Remote Sens. Environ., vol. 115, no. 2, pp. 524–535, 2011. [35] D. Zhang, S. Lavender, J.-P. Muller, D. Walton, B. Karlson, and J. Kronsell, “Determination of phytoplankton abundances (chlorophyll-a) in the optically complex inland water-the Baltic sea,” Sci. Total Environ., vol. 601, pp. 1060–1074, 2017. [36] S. W. Bailey and P. J. Werdell, “A multi-sensor approach for the on-orbit validation of ocean color satellite data products,” Remote Sens. Environ., vol. 102, no. 1-2, pp. 12–23, 2006. [37] M. A. Warren et al., “Assessment of atmospheric correction algorithms for the Sentinel-2A multispectral imager over coastal and inland waters,” Remote Sens. Environ., vol. 225, pp. 267–289, 2019. [38] M. Pereira-Sandoval et al., “Evaluation of atmospheric correction algorithms over spanish inland waters for Sentinel-2 multi spectral imagery data,” Remote Sens., vol. 11, no. 12, 2019, Art. no. 1469. [39] R. Smith and K. Baker, “Oceanic chlorophyll concentrations as determined by satellite (Nimbus-7 coastal zone color scanner),” Mar. Biol., vol. 66, no. 3, pp. 269–279, 1982. [40] H. Loisel and A. 
Morel, “Light scattering and chlorophyll concentration in case 1 waters: A reexamination,” Limnol. Oceanogr., vol. 43, no. 5, pp. 847–858, 1998. [41] H. R. Gordon, D. K. Clark, J. W. Brown, O. B. Brown, R. H. Evans, and W. W. Broenkow, “Phytoplankton pigment concentrations in the middle atlantic bight: Comparison of ship determinations and CZCS estimates,” Appl. Opt., vol. 22, no. 1, pp. 20–36, 1983. [42] S. Jain and J. Miller, “Subsurface water parameters: Optimization approach to their determination from remotely sensed water color data,” Appl. Opt., vol. 15, no. 4, pp. 886–890, 1976. [43] C. S. Yentsch, “The influence of phytoplankton pigments on the colour of sea water,” Deep Sea Res. (1953), vol. 7, no. 1, pp. 1–9, 1960. [44] J. E. O’Reilly et al., “Ocean color chlorophyll algorithms for SeaWiFS,” J. Geophysical Res., Oceans, vol. 103, no. C11, pp. 24 937–24 953, 1998. [45] J. E. O’Reilly and P. J. Werdell, “Chlorophyll algorithms for ocean color sensors-OC4, OC5 & OC6,” Remote Sens. Environ., vol. 229, pp. 32–47, 2019. [46] G. Liu et al., “An OLCI-based algorithm for semi-empirically partitioning absorption coefficient and estimating chlorophyll a concentration in various turbid case-2 waters,” Remote Sens. Environ., vol. 239, 2020, Art. no. 111648. [47] K. G. Ruddick, H. J. Gons, M. Rijkeboer, and G. Tilstone, “Optical remote sensing of chlorophyll a in case 2 waters by use of an adaptive two-band algorithm with optimal error properties,” Appl. Opt., vol. 40, no. 21, pp. 3575–3585, 2001. [48] A. A. Gitelson et al., “A simple semi-analytical model for remote estimation of chlorophyll-a in turbid waters: Validation,” Remote Sens. Environ., vol. 112, no. 9, pp. 3582–3593, 2008. [49] C. Le, C. Hu, D. English, J. Cannizzaro, and C. Kovach, “Climate-driven chlorophyll-a changes in a turbid estuary: Observations from satellites and implications for management,” Remote Sens. Environ., vol. 130, pp. 11–24, 2013. [50] X.-G. Xing, D.-Z. Zhao, Y.-G. Liu, J.-H. Yang, P.
Xiu, and L. Wang, “An overview of remote sensing of chlorophyll fluorescence,” Ocean. Sci. J., vol. 42, no. 1, pp. 49–59, 2007. [51] M. W. Matthews and D. Odermatt, “Improved algorithm for routine monitoring of cyanobacteria and eutrophication in inland and near-coastal waters,” Remote Sens. Environ., vol. 156, pp. 374–382, 2015. [52] M. E. Smith, L. R. Lain, and S. Bernard, “An optimized chlorophyll a switching algorithm for MERIS and OLCI in phytoplankton-dominated waters,” Remote Sens. Environ., vol. 215, pp. 217–227, 2018. [53] V. Piermattei et al., “Cost-effective technologies to study the Arctic ocean environment,” Sensors, vol. 18, no. 7, 2018, Art. no. 2257. [54] Y. Fan et al., “Atmospheric correction over coastal waters using multilayer neural networks,” Remote Sens. Environ., vol. 199, pp. 218–240, 2017. [55] F. Pu, C. Ding, Z. Chao, Y. Yu, and X. Xu, “Water-quality classification of inland lakes using Landsat8 images by convolutional neural networks,” Remote Sens., vol. 11, no. 14, 2019, Art. no. 1674. [56] J. Pyo et al., “A convolutional neural network regression for quantifying cyanobacteria using hyperspectral imagery,” Remote Sens. Environ., vol. 233, 2019, Art. no. 111350. [57] N. Pahlevan, S. Sarkar, B. Franz, S. Balasubramanian, and J. He, “Sentinel-2 multispectral instrument (MSI) data processing for aquatic science applications: Demonstrations and validations,” Remote Sens. Environ., vol. 201, pp. 47–56, 2017. [58] Q. Wang and P. M. Atkinson, “Spatio-temporal fusion for daily Sentinel-2 images,” Remote Sens. Environ., vol. 204, pp. 31–42, 2018. [59] M. Drusch et al., “Sentinel-2: ESA’s optical high-resolution mission for GMES operational services,” Remote Sens. Environ., vol. 120, pp. 25–36, 2012. [60] R. Coluzzi, V. Imbrenda, M. Lanfredi, and T. Simoniello, “A first assessment of the Sentinel-2 level 1-C cloud mask product to support informed surface analyses,” Remote Sens. Environ., vol. 217, pp. 426–443, 2018. [61] C.
Kuhn et al., “Performance of Landsat-8 and Sentinel-2 surface reflectance products for river remote sensing retrievals of chlorophyll-a and turbidity,” Remote Sens. Environ., vol. 224, pp. 104–118, 2019. [62] X. Pan, A. Mannino, M. E. Russ, and S. B. Hooker, “Remote sensing of the absorption coefficients and chlorophyll a concentration in the united states southern middle Atlantic bight from seaWiFS and MODIS-aqua,” J. Geophysical Res.: Oceans, vol. 113, no. C11, 2008. [63] C. D. Mobley, Light and Water: Radiative Transfer in Natural Waters. New York, NY, USA: Academic Press, 1994. [64] S. Bernard et al., Earth observations in support of global water quality monitoring, SER. reports and monographs of the international ocean colour coordinating group, Int. Ocean-Colour Coordinating Group, Canada, vol. IOCCG Report 17, no. 17, 2018. [65] A. Morel and J.-F. Berthon, “Surface pigments, algal biomass profiles, and potential production of the Euphotic layer: Relationships reinvestigated in view of remote-sensing applications,” Limnol. Oceanogr., vol. 34, no. 8, pp. 1545–1562, 1989. [66] N. Almaadeed, M. Asim, S. Al-Maadeed, A. Bouridane, and A. Beghdadi, “Automatic detection and classification of audio events for road surveillance applications,” Sensors, vol. 18, no. 6, 2018, Art. no. 1858. [67] S. Javed et al., “Cellular community detection for tissue phenotyping in colorectal cancer histology images,” Med. Image Anal., 2020, Art. no. 101696. [68] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” 2015, arXiv:1502.03167. [69] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proc. 13th Int. Conf. Artif. Intell. Statist., 2010, pp. 249–256. [70] F. Pedregosa et al., “Scikit-learn: Machine learning in Python,” J. Mach. Learn. Res., vol. 12, pp. 2825–2830, 2011. [71] B. N. Seegers, R. P. Stumpf, B. A. Schaeffer, K. A. Loftin, and P. J. 
Werdell, “Performance metrics for the assessment of satellite data products: An ocean color case study,” Opt. Exp., vol. 26, no. 6, pp. 7404–7422, 2018. [72] E. Spyrakos et al., “Optical types of inland and coastal waters,” Limnol. Oceanogr., vol. 63, no. 2, pp. 846–870, 2018. [73] J. Li and D. P. Roy, “A global analysis of Sentinel-2A, Sentinel-2B and Landsat-8 data revisit intervals and implications for terrestrial monitoring,” Remote Sens., vol. 9, no. 9, pp. 902–919, 2017, Art. no. 902.

Muhammad Asim received the bachelor’s degree in electronics from the Department of Electrical Engineering, Comsats University, Abbottabad, Pakistan, in October 2010, and the master’s degree in electrical engineering with emphasis on signal processing from the Department of Electrical Engineering, Blekinge Institute of Technology, Sweden, in February 2013. He is currently working toward the Ph.D. degree in remote sensing with the Center for Integrated Remote Sensing and Forecasting for Arctic Operations (CIRFA), Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway. He worked as a Research Assistant on different projects with Qatar University, Qatar, including audio processing, biomedical signal processing, and computer vision. His research interests include optical remote sensing of ocean areas, water quality assessment, retrieval of in-water optical properties, and machine learning.

Camilla Brekke received the Cand. Mag. (BSc) and Cand. Scient. (MSc) degrees in informatics (computing science), and the Ph.D. degree in remote sensing and image analysis from the Department of Informatics, University of Oslo, Oslo, Norway, in 1998, 2001, and 2008, respectively. She is a Professor with the Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway.
She is currently the Vice-Dean Research with the Faculty of Science and Technology and the Deputy Centre Leader at the Centre for Integrated Remote Sensing and Forecasting for Arctic Operations. Her current research interests include synthetic aperture radar and ocean color remote sensing for Arctic and marine applications.

Arif Mahmood is a Professor with the Department of Computer Science, Information Technology University, Lahore, Pakistan. Previously, he worked as a Research Assistant Professor with the University of Western Australia, Crawley WA, Australia, and as a Postdoctoral Researcher with Qatar University, Doha, Qatar. His research interests include computer vision and machine learning applications. More specifically, he has worked on moving object detection in videos, visual object tracking, visual object categorization, nucleus detection and tissue phenotyping in cancer histology images, face detection and facial expression synthesis, action detection and recognition, visual crowd analysis, anomalous event detection, human body pose estimation, scalable spectral clustering, community detection in complex networks, ocean color monitoring using remote sensing, and applications of machine learning in cloud and fog computing.

Torbjørn Eltoft received the M.Sc. degree in 1981 and the Ph.D. degree in 1984. He joined the Faculty of Science and Technology, UiT The Arctic University of Norway, in 1988, where he is employed as a Professor in remote sensing at the Department of Physics and Technology. He is the Director of the Centre for Integrated Remote Sensing and Forecasting for Arctic Operations (CIRFA). He has a significant publication record. His research interests include signal and image analysis, statistical modelling, and machine learning with applications in synthetic aperture radar and ocean colour remote sensing. Dr.
Eltoft was the co-recipient of the year 2000 Outstanding Paper Award in Neural Networks awarded by the IEEE Neural Networks Council, and of the Honourable Mention for the 2003 Pattern Recognition Journal Best Paper Award. He was the recipient of the “UiT Award for Research and Development” in 2017. He served as an Associate Editor for the Elsevier journal Pattern Recognition for the period 2005–2011, and was a Guest Editor for the journal Remote Sensing on the Special Issue for the PolInSAR 2017 conference.

Marit Reigstad received the master’s degree in marine zoology in 1994 and the Ph.D. degree in marine biology in 2000. She is a Field Biologist and enjoys interdisciplinary collaboration. She has participated in, initiated, and led several Arctic marine research projects, and is now leading the Norwegian collaborative research project The Nansen Legacy. This project aims to build a multidisciplinary knowledge basis for the changing Arctic marine region in the northern Barents Sea and adjacent Arctic Ocean. The Nansen Legacy includes ten institutions and more than 150 scientists focusing on the eco- and climate system of the northern Barents Sea and adjacent Arctic Ocean. She has been a Professor in marine ecology with UiT since 2009. She teaches ecology and marine systems, including supervision at the Master, Ph.D., and postdoctoral levels. She has also been involved in developing the Norwegian strategy for the Decade of ocean science for sustainability. She has authored or coauthored 71 peer-reviewed papers. Her research interests include productivity, plankton and their fate in marine ecosystems, with a special interest in vertical flux and in ice-impacted areas. Dr. Reigstad was a Guest Editor for the special issue CarbonBridge to the Arctic in Frontiers of Marine Science, 2020. She has been involved in international science planning since 2011.
She serves on the Board of Tromsø Forskningsstiftelse, the Scientific Liaison Panel of the EU project ARICE, and the scientific advisory board of the Chinese-Norwegian research project STRESSOR.