Published: Sept. 20, 2021
This tutorial demonstrates how to work with the HLS Landsat 8 (HLSL30.002) and Sentinel-2 (HLSS30.002) data products in R.
The Harmonized Landsat Sentinel-2 (HLS) project is a NASA initiative to produce a consistent, harmonized surface reflectance product from data acquired by the Operational Land Imager (OLI) aboard Landsat 8 and the Multi-Spectral Instrument (MSI) aboard Sentinel-2. All the necessary radiometric, spectral, geometric, and spatial corrections have been applied so that HLS forms a seamless time series that is stackable and comparable. Dense HLS time series create unique opportunities to monitor and map dynamics in land surface properties with unprecedented spatial detail. Applications such as land cover change, agricultural management, disaster response, water resources, and vegetation phenology studies benefit greatly from the higher temporal resolution of this dataset.
NASA’s Land Processes Distributed Active Archive Center (LP DAAC) archives and distributes HLS products in the LP DAAC Cumulus cloud archive as Cloud Optimized GeoTIFFs (COGs). The primary objective of this tutorial is to show how to query and subset HLS data using the NASA CMR-STAC application programming interface (API). Because the archive is publicly available from the LP DAAC Cumulus cloud, you no longer need to download HLS source data files for your research. STAC allows you to subset your desired data collection by region, time, and band, so you only download the data you really want.
In this tutorial, a case study is used to show how to process (calculate NDVI and apply quality filtering), visualize, and calculate statistics for an NDVI time series derived from HLS data over a region of interest. The tutorial also shows how to export the statistics so you will have all of the information you need for your area of interest without having to download the source data files. Multiple agricultural fields in the Central Valley of California in the United States are used as an example of how to interact with HLS data.
Daily 30 meter (m) global HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance - HLSS30.002
Daily 30 meter (m) global HLS Landsat-8 OLI Surface Reflectance - HLSL30.002
R and RStudio are required to execute this tutorial. Installation details can be found here.
This tutorial has been tested on Windows using R Version 4.0.5 and RStudio version 1.2.5033.
A NASA Earthdata Login account is required to access the data used in this tutorial. You can create an account here.
Setting up a netrc file:
This tutorial leverages a netrc file storing your Earthdata Login credentials for authentication. This file is assumed to be stored in the user profile directory on Windows OS or in the home directory on Mac OS. Run the chunk below (earthdata_netrc_setup.R) to ensure you have the proper netrc file set up. If you are prompted for your NASA Earthdata Login username and password, hit enter once you have submitted your credentials. If neither HOME nor Userprofile is recognized by R, the current working directory is used.
- If you want to manually create your own netrc file (_netrc for Windows OS and .netrc for Mac OS), download the .netrc file template, add your credentials, and save it to your Userprofile/HOME directory. You should also make sure that the HOME and Userprofile environment variables point to the same directory where you saved the .netrc/_netrc file. To do that, run Sys.setenv("HOME" = "YOUR DIRECTORY") and Sys.setenv("userprofile" = "YOUR DIRECTORY").
source("Scripts/earthdata_netrc_setup.R")
Clone or download HLS_Tutorial_R Repository from the LP DAAC Data User Resources Repository.
When you open this Rmarkdown notebook in RStudio, you can click the little green "Play" button in each grey code chunk to execute the code. The result can be printed either in the R Console or inline in the RMarkdown notebook, depending on your RStudio preferences.
Check your R version by typing version into the console and your RStudio version by typing RStudio.Version() into the console, and update them if needed.
Windows
Install and load installr:
install.packages("installr");library(installr)
Copy/Update the existing packages to the new R installation:
updateR()
Open RStudio, go to Help > Check for Updates to install a newer version of RStudio (if available).
Mac
Required packages:
rgdal
leaflet
jsonlite
raster
sp
rasterVis
httr
ggplot2
RColorBrewer
dygraphs
xts
xml2
lubridate
DT
Run the cell below to identify any missing packages to install, and then load all of the required packages.
packages <- c('leaflet','rgdal','raster','jsonlite','sp','httr','rasterVis','ggplot2','magrittr', 'RColorBrewer','xml2','dygraphs','xts','lubridate','DT','rmarkdown', 'rprojroot')
# Identify missing (not installed) packages
new.packages = packages[!(packages %in% installed.packages()[,"Package"])]
# Install new (not installed) packages
if(length(new.packages)) install.packages(new.packages, repos='http://cran.rstudio.com/') else print('All required packages are installed.')
Next, load all packages using the library() function.
invisible(lapply(packages, library, character.only = TRUE))
Create an output directory for the results.
# Create an output directory if it doesn't exist
wd <- rprojroot::find_rstudio_root_file()
outDir <- file.path(wd, "Data", "R_Output", fsep="/")
suppressWarnings(dir.create(outDir))
Next, assign the LPCLOUD STAC Search URL to a static variable.
search_URL <- 'https://cmr.earthdata.nasa.gov/stac/LPCLOUD/search'
We will use the LPCLOUD STAC search endpoint to query the HLS data by region of interest and time period of interest. In order to retrieve STAC Items that match your criteria, you need to define your query parameters, which we will walk through below.
To learn how to navigate through the structure of a CMR-STAC Catalog and define Search parameters, see Getting Started with NASA's CMR-STAC API. Information on the specifications for adding parameters to a search query can be found here.
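If you would like a quick look at what the LPCLOUD catalog contains before building the search, you can request its collections directly (a minimal sketch; the /collections endpoint path and the structure of the parsed response are assumed from the STAC specification, not shown elsewhere in this tutorial).
# Sketch: list the collection IDs available in the LPCLOUD catalog
lpcloud_collections <- httr::GET('https://cmr.earthdata.nasa.gov/stac/LPCLOUD/collections') %>%
  httr::content(as = "text") %>%
  jsonlite::fromJSON()
head(lpcloud_collections$collections$id) # assumed field layout; includes the HLS collection IDs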
Assign a list of one or more Collection IDs (product shortnames) to a variable to include in the search query. Only Items in the provided Collections will be searched. Here, we are interested in both the HLS Landsat-8 and Sentinel-2 collections. To learn how to access Collection IDs in CMR-STAC, visit Getting Started with NASA's CMR-STAC API.
HLS_col <- list("HLSS30.v2.0", "HLSL30.v2.0")
We can assign our spatial region of interest by loading a GeoJSON file using the rgdal package. An example GeoJSON file named FieldBoundary.geojson is supplied in the 'Data' directory of the 'hls_tutorial_r' repository. Read the file in and use it to perform the spatial subset.
cropland_geojson <- readOGR("Data/FieldBoundary.geojson")
Then we can use the leaflet package to plot the agricultural field boundaries on top of an ESRI world imagery basemap.
leaflet() %>%
addPolygons(data = cropland_geojson, fill = FALSE) %>%
addProviderTiles(providers$Esri.WorldImagery) %>%
addMiniMap(zoomLevelFixed = 5)
This map lets us visualize our study area (Region of Interest (ROI)), which is important to confirm. But when we actually want to search CMR-STAC for a spatial subset, we'll need a parameter called a "bounding box", defined by the lower-left and upper-right coordinates of our study area. Below, we use the raster package to extract the extent of the input GeoJSON so we can use it to create the spatial search parameter.
roi <- extent(cropland_geojson)
cropland_bbox <- paste(roi[1], roi[3], roi[2], roi[4], sep = ',')
Next, we will need to define a temporal search query parameter. In our example, we will set the time period of interest to the two months of August and September 2021. Note that temporal ranges should be specified as a pair of date-time values separated by a comma (,) or a forward slash (/). Each date-time value must have the format YYYY-MM-DDTHH:MM:SSZ. Additional information on setting temporal searches can be found in the NASA CMR Documentation.
cropland_datetime <- '2021-08-01T00:00:00Z/2021-09-30T23:59:59Z' # YYYY-MM-DDTHH:MM:SSZ/YYYY-MM-DDTHH:MM:SSZ
Now we are all prepared to submit a request to the CMR-STAC endpoint! We will do this using the httr package. The output will show all the available data matching our search criteria based on our datetime and bounding box.
cropland_search_body <- list(limit=100,
datetime=cropland_datetime,
bbox= cropland_bbox,
collections= HLS_col)
cropland_search_req <- httr::POST(search_URL, body = cropland_search_body, encode = "json") %>%
httr::content(as = "text") %>%
fromJSON()
cat('There are', cropland_search_req$numberMatched, 'features matching our request.')
There are 16 features matching our request.
Now we can look at the name fields in the output.
names(cropland_search_req)
[1] "type" "stac_version" "numberMatched" "numberReturned"
[5] "features" "links" "context"
Detailed information about the features in our response can be found in the features field. Let's select the first feature and view its content. We can print the output in a "pretty" JSON format.
search_features <- cropland_search_req$features
Feature1 <- search_features[1,]
Feature1 %>%
jsonlite::toJSON(auto_unbox = TRUE, pretty = TRUE)
[
{
"type": "Feature",
"id": "HLS.L30.T10SEJ.2021215T185132.v2.0",
"stac_version": "1.0.0",
"stac_extensions": "https://stac-extensions.github.io/eo/v1.0.0/schema.json",
"collection": "HLSL30.v2.0",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[-122.1553, 38.7578],
[-121.8321, 39.7444],
[-123.0002, 39.7503],
[-123.0002, 38.7609],
[-122.1553, 38.7578]
]
]
},
"bbox": [-123.0002, 38.7578, -121.8321, 39.7503],
"links": [
{
"rel": "self",
"href": "https://cmr.earthdata.nasa.gov/stac/LPCLOUD/collections/HLSL30.v2.0/items/HLS.L30.T10SEJ.2021215T185132.v2.0"
},
{
"rel": "parent",
"href": "https://cmr.earthdata.nasa.gov/stac/LPCLOUD/collections/HLSL30.v2.0"
},
{
"rel": "collection",
"href": "https://cmr.earthdata.nasa.gov/stac/LPCLOUD/collections/HLSL30.v2.0"
},
{
"rel": "root",
"href": "https://cmr.earthdata.nasa.gov/stac/"
},
{
"rel": "provider",
"href": "https://cmr.earthdata.nasa.gov/stac/LPCLOUD"
},
{
"rel": "via",
"href": "https://cmr.earthdata.nasa.gov/search/concepts/G2099379244-LPCLOUD.json"
},
{
"rel": "via",
"href": "https://cmr.earthdata.nasa.gov/search/concepts/G2099379244-LPCLOUD.umm_json"
}
],
"properties": {
"datetime": "2021-08-03T18:51:32.308Z",
"start_datetime": "2021-08-03T18:51:32.308Z",
"end_datetime": "2021-08-03T18:51:56.199Z",
"eo:cloud_cover": 0
},
"assets": {
"B11": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B11.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B11.tif"
},
"B05": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B05.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B05.tif"
},
"B10": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B10.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B10.tif"
},
"B03": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B03.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B03.tif"
},
"B02": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B02.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B02.tif"
},
"B04": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B04.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B04.tif"
},
"VZA": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.VZA.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.VZA.tif"
},
"B09": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B09.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B09.tif"
},
"VAA": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.VAA.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.VAA.tif"
},
"B07": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B07.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B07.tif"
},
"SZA": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.SZA.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.SZA.tif"
},
"B01": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B01.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B01.tif"
},
"B06": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.B06.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.B06.tif"
},
"SAA": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.SAA.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.SAA.tif"
},
"Fmask": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.Fmask.tif",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.Fmask.tif"
},
"browse": {
"title": "Download HLS.L30.T10SEJ.2021215T185132.v2.0.jpg",
"href": "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-public/HLSL30.020/HLS.L30.T10SEJ.2021215T185132.v2.0/HLS.L30.T10SEJ.2021215T185132.v2.0.jpg",
"type": "image/jpeg"
},
"metadata": {
"href": "https://cmr.earthdata.nasa.gov/search/concepts/G2099379244-LPCLOUD.xml",
"type": "application/xml"
},
"B12": {},
"B08": {},
"B8A": {}
}
}
]
Next, let's load the browse image to get a quick view of the first feature.
browse_image_url <- Feature1$assets$browse$href
browse_req <- GET(browse_image_url) %>%
httr::content()
plot(0:1, 0:1, type = "n",ann=FALSE,axes=FALSE)
rasterImage(browse_req,0, 0, 1,1)
Our first view of NASA data from the cloud, right here in RStudio!
As you can see in our image, there is quite a bit of cloud cover. Cloud cover is something we can explore for each STAC Item using the 'eo:cloud_cover' property. Also recall that each STAC Item in this case contains multiple data assets, so the cloud cover percentage returned applies to all data assets associated with the STAC Item.
for (i in row.names(search_features)){
cc <- search_features[i,]$properties$`eo:cloud_cover`
d <- search_features[i,]$properties$datetime
cat('Cloud Coverage of Feature ',i, 'Captured on ', d , 'Is: ' , cc , 'Percent' ,'\n')
}
Cloud Coverage of Feature 1 Captured on 2021-08-03T18:51:32.308Z Is: 0 Percent
Cloud Coverage of Feature 2 Captured on 2021-08-07T19:03:43.241Z Is: 12 Percent
Cloud Coverage of Feature 3 Captured on 2021-08-12T18:45:24.905Z Is: 35 Percent
Cloud Coverage of Feature 4 Captured on 2021-08-12T19:03:40.891Z Is: 29 Percent
Cloud Coverage of Feature 5 Captured on 2021-08-17T19:03:36.679Z Is: 21 Percent
Cloud Coverage of Feature 6 Captured on 2021-08-22T19:03:39.304Z Is: 1 Percent
Cloud Coverage of Feature 7 Captured on 2021-08-27T19:03:42.219Z Is: 18 Percent
Cloud Coverage of Feature 8 Captured on 2021-08-28T18:45:29.398Z Is: 0 Percent
Cloud Coverage of Feature 9 Captured on 2021-09-01T19:03:37.146Z Is: 0 Percent
Cloud Coverage of Feature 10 Captured on 2021-09-04T18:51:42.520Z Is: 0 Percent
Cloud Coverage of Feature 11 Captured on 2021-09-06T19:03:40.937Z Is: 14 Percent
Cloud Coverage of Feature 12 Captured on 2021-09-11T19:03:36.765Z Is: 1 Percent
Cloud Coverage of Feature 13 Captured on 2021-09-13T18:45:33.772Z Is: 1 Percent
Cloud Coverage of Feature 14 Captured on 2021-09-16T19:03:42.255Z Is: 0 Percent
Cloud Coverage of Feature 15 Captured on 2021-09-20T18:51:45.506Z Is: 0 Percent
Cloud Coverage of Feature 16 Captured on 2021-09-29T18:45:37.906Z Is: 0 Percent
This section will demonstrate how to leverage the FieldBoundary.geojson boundaries to perform a spatial subset as well as identify the bands we are interested in and access those data.
First, set up rgdal configurations to access the cloud assets that we are interested in. You can learn more about these configuration options here.
rgdal::setCPLConfigOption(ConfigOption = "GDAL_HTTP_UNSAFESSL", value = "YES")
rgdal::setCPLConfigOption(ConfigOption = "GDAL_HTTP_COOKIEFILE", value = ".rcookies")
rgdal::setCPLConfigOption(ConfigOption = "GDAL_HTTP_COOKIEJAR", value = ".rcookies")
rgdal::setCPLConfigOption(ConfigOption = "GDAL_DISABLE_READDIR_ON_OPEN", value = "YES")
rgdal::setCPLConfigOption(ConfigOption = "CPL_VSIL_CURL_ALLOWED_EXTENSIONS", value = "TIF")
To subset assets by band, we will filter bands to include the NIR, Red, and Quality (Fmask) layers in the list of links to access. Below you can find the different band numbers for each of the two products. Additional information on HLS band allocations can be found here.
Sentinel-2 (HLSS30): NIR = B8A, Red = B04, Quality = Fmask
Landsat 8 (HLSL30): NIR = B05, Red = B04, Quality = Fmask
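The same mapping can also be written as a small lookup list, mirroring the if/else logic used in the loop below (a sketch; the band names match the assets returned by our search).
# NIR, Red, and Quality band names for each HLS product (used to build the NDVI table below)
ndvi_band_lookup <- list('HLSS30.v2.0' = c('B8A', 'B04', 'Fmask'), # Sentinel-2: narrow NIR, Red, Quality
                         'HLSL30.v2.0' = c('B05', 'B04', 'Fmask')) # Landsat 8: NIR, Red, Quality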
Below, we'll make a searchable data table including links to the assets. The granule_list object is defined to store these links, which are then combined into a searchable data table.
granule_list <- list()
n <- 1
for (item in row.names(search_features)){ # Get the NIR, Red, and Quality band layer names
if (search_features[item,]$collection == 'HLSS30.v2.0'){
ndvi_bands <- c('B8A','B04','Fmask')
}
else{
ndvi_bands <- c('B05','B04','Fmask')
}
for(b in ndvi_bands){
f <- search_features[item,]
b_assets <- f$assets[[b]]$href
df <- data.frame(Collection = f$collection, # Make a data frame including links and other info
Granule_ID = f$id,
Cloud_Cover = f$properties$`eo:cloud_cover`,
band = b,
Asset_Link = b_assets, stringsAsFactors=FALSE)
granule_list[[n]] <- df
n <- n + 1
}
}
# Create a searchable datatable
search_df <- do.call(rbind, granule_list)
DT::datatable(search_df)
Next, we'll want to remove the items with too much cloud coverage; here, we remove items with more than 30% cloud cover.
search_df <- search_df[search_df$Cloud_Cover < 30, ]
Accessing files in the cloud requires you to authenticate using your NASA Earthdata Login account. In the prerequisite section of this tutorial, a proper netrc file was set up by calling the earthdata_netrc_setup.R script.
cat('The netrc file can be found in:', Sys.getenv('HOME'))
Below, convert the GeoJSON from its geographic projection (EPSG:4326) into the native projection of HLS, UTM (aligned to the Military Grid Reference System). This must be done in order to use the Region of Interest (ROI) to subset the COG files. If you have trouble running the code chunk below, make sure your credentials are entered correctly in the created netrc file.
crs(cropland_geojson)
CRS arguments: +proj=longlat +datum=WGS84 +no_defs
coordinate_reference <- raster(paste0(search_df$Asset_Link[1])) # Extract the CRS from one of the assets
cropland_geojson_utm <- spTransform(cropland_geojson, crs(coordinate_reference)) # Transform the CRS
crs(cropland_geojson_utm)
CRS arguments:
+proj=utm +zone=10 +ellps=WGS84 +units=m +no_defs
Below, we create a list of raster layers for each of our bands of interest (i.e., Red, NIR, and Fmask). Each band is read into memory and then cropped to our area of interest. The cropped rasters will be stacked, and we'll use these stacks to calculate the Normalized Difference Vegetation Index (NDVI) and mask for cloud contamination. Note that the raster function makes a separate request for each data layer located in the Cumulus cloud archive. This takes time, and the more data layers we have in the time series, the longer this cell takes to run.
red_stack <- nir_stack <- fmask_stack <- list()
# Add progress bar
pb = txtProgressBar(min = 0, max = length(search_df$band), initial = 0, style = 3)
l <- m <- n <- 0
for (row in seq(length(search_df$band))){
setTxtProgressBar(pb,row)
if (search_df$band[row] == 'B04'){
l = l+1
red <- raster(search_df$Asset_Link[row])
red_crop <- raster::mask (raster::crop(red, extent(cropland_geojson_utm)), cropland_geojson_utm)
red_stack[[l]] <- red_crop
rm (red, red_crop)
}else if (search_df$band[row] == 'Fmask'){
m = m+1
fmask <- raster(search_df$Asset_Link[row])
fmask_crop <- raster::mask (raster::crop(fmask, extent(cropland_geojson_utm)), cropland_geojson_utm)
fmask_stack[[m]] <- fmask_crop
rm(fmask, fmask_crop)
}else{
n = n+1
nir <- raster(search_df$Asset_Link[row])
nir_crop <- raster::mask (raster::crop(nir, extent(cropland_geojson_utm)), cropland_geojson_utm)
nir_stack [[n]] <- nir_crop
rm(nir, nir_crop)
}
}
close(pb)
Now, the cropped and stacked rasters are loaded into memory without being downloaded. Running the code below should confirm this is true.
inMemory(nir_stack[[1]])
[1] TRUE
Now we can start asking our science questions. First we define the NDVI function and then execute it on the data loaded into memory. After that, we can perform quality filtering to screen out any poor-quality observations.
Create a function to calculate NDVI.
calculate_NDVI <- function(nir, red){
ndvi <- (nir-red)/(nir+red)
return(ndvi)
}
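As a quick sanity check, you can call the function with scalar reflectance values before applying it to the rasters (an illustrative example only).
# Example: a vegetated pixel with high NIR and low Red reflectance
calculate_NDVI(nir = 0.5, red = 0.1) # returns approximately 0.667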
Now we can calculate NDVI from the stacked Red and NIR rasters. We will create layer names from the dates the data were captured, prefixed with an L or S to indicate which sensor the data come from.
ndvi_stack <- list()
for (i in 1:length(nir_stack)){
# Calculate NDVI
ndvi_stack[[i]] <- calculate_NDVI (nir_stack[[i]], red_stack[[i]])
# Exclude the Inf and -Inf from NDVI
ndvi_stack[[i]][ndvi_stack[[i]] == Inf] <- NA
ndvi_stack[[i]][ndvi_stack[[i]] == -Inf] <- NA
layer_name <- names(nir_stack[[i]]) # Extract date from layer names
date <- strsplit(layer_name, "[.]")[[1]][4]
date <- substr(date[1], 1, 7)
date_time <- as.Date(as.integer(substr(date,5,7)) - 1, origin = paste0(substr(date,1,4), "-01-01")) # Convert day of year to a calendar date (day 1 = Jan 1)
if (strsplit(layer_name, ".T")[[1]][1] == 'HLS.S30'){ # Set band names
names(ndvi_stack[[i]]) <- paste0('S',as.character(date_time))
}else{
names(ndvi_stack[[i]]) <- paste0('L',as.character(date_time))
}
}
ndvi_stacks <- stack(ndvi_stack) # Create a stack of NDVI
Now we plot! Let's start with the first item in the NDVI time series.
# Create a color palette
pal <- colorNumeric(terrain.colors(n = 100), c(0,1) ,na.color = "transparent", reverse = TRUE)
leaflet() %>%
addProviderTiles(providers$Esri.WorldImagery) %>%
addRasterImage(ndvi_stacks[[1]], color = pal, opacity = 1) %>%
addPolygons(data = cropland_geojson, fill = FALSE) %>%
addMiniMap(zoomLevelFixed = 5) %>%
leaflet::addLegend(pal = pal, values = c(0,1), title = "NDVI")
Now we should do additional quality filtering.
In HLS, Fmask values of 0 and 64 both indicate a pixel without cloud, cloud shadow, water, or snow/ice; a value of 0 indicates climatology aerosol level and 64 indicates low aerosol level. We will use these values to mask out poor-quality pixels from the ndvi_stacks. HLS quality information can be found in section 6.5 of the HLS V2.0 User Guide.
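Because the Fmask layer is bit-packed, individual conditions can also be inspected bit by bit. Below is a minimal sketch assuming the bit layout described in the HLS V2.0 User Guide (for example, bit 1 = cloud, bit 3 = cloud shadow); the fmask_bit_set helper is hypothetical and not part of this tutorial's workflow.
# Hypothetical helper: test whether a given bit is set in a vector of Fmask values
fmask_bit_set <- function(fmask_values, bit){
  bitwAnd(as.integer(fmask_values), bitwShiftL(1L, as.integer(bit))) > 0
}
# Example: flag cloudy pixels (assumed bit 1) in the first quality raster
# cloudy <- fmask_bit_set(values(fmask_stack[[1]]), 1)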
Create a stack of masked rasters and mask poor quality pixels.
mask_raster <- list()
ndvi_filtered <- list()
for (i in 1:length(fmask_stack)){
mask_raster[[i]] <- fmask_stack[[i]]
mask_raster[[i]][!values(mask_raster[[i]]) %in% c(0, 64)] <- NA # Keep only clear pixels (Fmask values 0 or 64)
ndvi_filtered[[i]] <- mask(ndvi_stacks[[i]], mask_raster[[i]], maskvalue=NA )
names(ndvi_filtered[[i]]) <- names(ndvi_stacks[[i]])
}
ndvi_filtered_stacks <- stack(ndvi_filtered)
Now we can plot multiple layers to create an interactive NDVI time series map with leaflet. Click on the dates on the left side to view the layers.
base<-c('map<-leaflet()%>%
addProviderTiles(providers$Esri.WorldImagery) %>%
addMiniMap(zoomLevelFixed = 5) %>%')
# make a string including the addRasterImage function for every layer in the raster stack
X <- lapply(1:nlayers(ndvi_filtered_stacks), function(j){
paste(paste("addRasterImage(ndvi_filtered_stacks[[",j,"]],
colors = pal,
opacity=1,
group=names(ndvi_filtered_stacks[[",j,"]]))", sep=""),"%>% \n")})
X <- do.call(paste, X)
controls<-"addLayersControl(baseGroups=names(ndvi_stacks),
options = layersControlOptions(collapsed=F), position = 'topleft')%>%"
legend <- "leaflet::addLegend(pal = pal, values = c(0,1), title = 'NDVI')"
final <- paste(base, X, controls, legend ,sep="\n")
eval(parse(text=final))
map
Above, the time series shows the changes in NDVI starting in 2021. The NDVI values are increasing over time, meaning dormant fields in winter begin greening as we move into spring. The spatial variation in NDVI is also visible, showing that each field is cultivated differently.
Note that there are some empty observations (e.g. S2021.04.04) after quality filtering.
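To see which layers were fully masked out, you can count the valid (non-NA) pixels in each layer (a minimal sketch).
# Number of valid (non-NA) pixels per layer; fully masked layers return 0
cellStats(!is.na(ndvi_filtered_stacks), stat = 'sum')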
Below, plot the time series as boxplots showing the distribution of NDVI values for our farm fields.
raster::boxplot(ndvi_filtered_stacks, col=c('olivedrab3'), main='NDVI Time Series', ylab='NDVI', names = names(ndvi_stacks), las=2)
Next, calculate the statistics for each observation and export them to CSV. The quality-filtered raster stack is used to calculate the statistics.
ndvi_mean <- cellStats(ndvi_filtered_stacks, stat='mean', na.rm=TRUE)
ndvi_max <- cellStats(ndvi_filtered_stacks, stat='max', na.rm=TRUE)
ndvi_min <- cellStats(ndvi_filtered_stacks, stat='min', na.rm=TRUE)
ndvi_sd <- cellStats(ndvi_filtered_stacks, stat='sd', na.rm=TRUE)
ndvi_median <- cellStats(ndvi_filtered_stacks, stat='median', na.rm=TRUE)
Make interactive plots of the raster statistics using the dygraphs library. The date is formatted using the lubridate package, and the xts package is used to transform the data frame to xts format.
stats <- data.frame(
Date=substr(names(ndvi_filtered_stacks), 2,11),
NDVI_Max = ndvi_max,
NDVI_Min = ndvi_min,
NDVI_Median = ndvi_median,
NDVI_mean = ndvi_mean,
NDVI_SD = ndvi_sd
)
stats$Date <- ymd(stats$Date) # reformat the date
variables = xts(x=stats[,-1], order.by=stats$Date) # Choose the cols with the variables
dygraph(variables)
If you want to export these statistics, you can write them to a CSV file.
stats_name <- file.path(outDir, "NDVI_Statistics.csv")
write.csv(stats,stats_name)
Finally, if you want to capture the final output files locally on your machine, you can export the output files as GeoTIFFs.
for (i in 1:length(ndvi_filtered)){
output_name <- file.path(outDir, paste0(names(ndvi_filtered[[i]]), "_NDVI.tif"))
writeRaster(ndvi_filtered[[i]], output_name, overwrite = TRUE)
}
The RasterStack object can also be written to the disk.
output_name <- file.path(outDir, "NDVI_stack.tif")
writeRaster(ndvi_filtered_stacks ,filename=output_name, prj = TRUE , overwrite=TRUE)
And we're done! You have successfully analyzed data in the cloud, exporting just the information you needed for your area of interest rather than having to download everything.