Working with ECOSTRESS Evapotranspiration Data

This tutorial demonstrates how to work with the ECOSTRESS Evapotranspiration PT-JPL Daily L3 Global 70m Version 1 (ECO3ETPTJPL.001) data product in Python.

The Land Processes Distributed Active Archive Center (LP DAAC) distributes the Ecosystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS) data products. The ECOSTRESS mission is tasked with measuring the temperature of plants to better understand how much water plants need and how they respond to stress. ECOSTRESS products are archived and distributed in the HDF5 file format as swath-based products.

In this tutorial, you will use Python to perform a swath to grid conversion that projects the swath data onto a grid with a defined coordinate reference system (CRS), compare ECOSTRESS data with ground-based AmeriFlux flux tower observations, and export science dataset (SDS) layers as GeoTIFF files that can be loaded into GIS and/or remote sensing software.


Example: Converting a swath ECO3ETPTJPL.001 HDF5 file into a GeoTIFF with a defined CRS and comparing ECOSTRESS Evapotranspiration (ET) with ground-based ET observations from an AmeriFlux flux tower location in California.

Data Used in the Example:


Topics Covered:

  1. Getting Started
    1a. Import Packages
    1b. Set Up the Working Environment
    1c. Retrieve Files
  2. Importing and Interpreting Data
    2a. Open an ECOSTRESS HDF5 File and Read File Metadata
    2b. Subset SDS Layers
  3. Performing Swath2grid Conversion
    3a. Import Geolocation File
    3b. Define Projection and Output Grid
    3c. Read SDS Metadata
    3d. Perform K-D Tree Resampling
    3e. Basic Image Processing
  4. Exporting Results
    4a. Set Up a Dictionary
    4b. Define CRS and Export as GeoTIFFs
  5. Combining ECOSTRESS and AmeriFlux Tower Data
    5a. Loading Tables with Pandas
    5b. Locate ECOSTRESS Pixel from Lat/Lon Coordinates
  6. Visualizing Data
    6a. Create a Colormap
    6b. Calculate Local Overpass Time
    6c. Plot ET Data
    6d. Exporting an Image
  7. Comparing Observations
    7a. Calculate Distribution of ECOSTRESS Data
    7b. Visualize Ground Observations
    7c. Combine ECOSTRESS and Ground Observations

Before Starting this Tutorial:

If you are simply looking to batch process/perform the swath2grid conversion for ECOSTRESS files, be sure to check out the ECOSTRESS Swath to Grid Conversion Script.

NOTE: This tutorial was developed specifically for the ECOSTRESS Evapotranspiration PT-JPL Level 3, Version 1 HDF5 files and will need to be adapted to work with other ECOSTRESS products.

Dependencies:

Disclaimer: This tutorial has been tested on Windows and macOS using the specifications identified below.


Procedures:

Getting Started:

1. This tutorial uses data from ECOSTRESS Version 1, including an ECO3ETPTJPL.001 (and accompanying ECO1BGEO.001) observation from August 05, 2018. You can download the files directly from the LP DAAC Data Pool at:

The tower_data.csv file will need to be downloaded into the same directory as the tutorial in order to execute the tutorial.

2. Copy/clone/download the ECOSTRESS Tutorial repo, or the desired tutorial from the LP DAAC Data User Resources Repository:


1. Getting Started


1a. Import Packages

Import the Python packages required to complete this tutorial.


1b. Set Up the Working Environment

The input directory is defined as the current working directory. Note that you will need to have the Jupyter Notebook and example data (.h5 and .csv) stored in this directory in order to execute the tutorial successfully.

Make sure that the ECOSTRESS .h5 data files and the AmeriFlux ET data file (.csv) are located in the input directory printed above.


1c. Retrieve Files

Make sure that the ECO1BGEO and ECO3ETPTJPL .h5 files listed in the directions have been downloaded to the inDir defined above to follow along in the tutorial.

The standard format for ECOSTRESS filenames is as follows:

ECOSTRESS_L3_ET_PT-JPL: Product Type
00468: Orbit number; starting at start of mission, ascending equatorial crossing
007: Scene ID; starting at first scene of first orbit
20180805T220314: Date and time of data start: YYYYMMDDThhmmss
0601: Build ID of software that generated product, Major+Minor (2+2 digits)
04: Product version number (2 digits)
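
As a minimal sketch, the components listed above can be pulled apart with plain string handling. The example filename is assembled from those components, and the `.h5` extension and the `parse_ecostress_name` helper are assumptions made for illustration rather than part of the original tutorial code.

```python
from datetime import datetime

def parse_ecostress_name(filename):
    """Split an ECOSTRESS L3 filename into its documented components."""
    stem = filename.rsplit('.', 1)[0]          # drop the file extension
    # The product type itself contains underscores, so split from the right
    product, orbit, scene, start, build, version = stem.rsplit('_', 5)
    return {
        'product': product,
        'orbit': orbit,
        'scene': scene,
        'start': datetime.strptime(start, '%Y%m%dT%H%M%S'),
        'build': build,
        'version': version,
    }

# Hypothetical example name assembled from the components listed above
info = parse_ecostress_name('ECOSTRESS_L3_ET_PT-JPL_00468_007_20180805T220314_0601_04.h5')
print(info['product'], info['orbit'], info['scene'], info['start'])
```

Splitting from the right keeps the hyphenated, underscore-separated product type intact as a single field.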


2. Importing and Interpreting Data


2a. Open an ECOSTRESS HDF5 File

Read in an ECOSTRESS HDF5 file using the h5py package.


2b. Subset SDS Layers and Read SDS Metadata

Identify and generate a list of all the SDS layers in the HDF5 file.

Below, subset the SDS list to the two layers needed for comparison with the ground-based AmeriFlux data, ETinst and ETinstUncertainty.
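
A sketch of the subsetting step, assuming the SDS names have already been collected into a list of HDF5 paths. The paths in `all_sds` are illustrative stand-ins rather than values read from a file, and the group name is an assumption.

```python
# Stand-in for the list of SDS paths collected from the HDF5 file (illustrative only)
all_sds = [
    'EVAPOTRANSPIRATION PT-JPL/ETcanopy',
    'EVAPOTRANSPIRATION PT-JPL/ETdaily',
    'EVAPOTRANSPIRATION PT-JPL/ETinst',
    'EVAPOTRANSPIRATION PT-JPL/ETinstUncertainty',
    'EVAPOTRANSPIRATION PT-JPL/ETsoil',
]

# Layers needed for the comparison with the AmeriFlux tower data
wanted = ['ETinst', 'ETinstUncertainty']

# Match on the final path component so each name is compared exactly
sds_subset = [s for s in all_sds if s.split('/')[-1] in wanted]
print(sds_subset)
```

Comparing the final path component exactly, rather than testing for a substring, avoids picking up unrelated layers whose names merely contain `ETinst`.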


3. Performing Swath2grid Conversion

Resample the native ECOSTRESS swath data to a grid with defined coordinate reference system (CRS).


3a. Import Geolocation File

The latitude and longitude arrays from the ECO1BGEO product for the same ECOSTRESS orbit/scene ID are needed to perform the swath2grid conversion on the ECO3ETPTJPL file.

Read in the ECO1BGEO file, search for the latitude and longitude SDS, and import into Python as arrays.


3b. Define Projection and Output Grid

The following sections use the pyresample package to resample the ECOSTRESS swath dataset to a grid using the nearest neighbor method. This process begins by defining the swath dimensions using the lat/lon arrays below.

Define the coordinates in the middle of the swath, which are used to calculate an estimate of the output rows/columns for the gridded output.

Below, pyproj.Proj is used to perform a cartographic transformation by defining an Azimuthal Equidistant projection centered on the midpoint of the swath. Once the projection is defined, convert the lower left and upper right corners of the lat/lon arrays to a location (in meters) in the new projection. Lastly, measure the distance between the corners and divide by 70 (meters), the nominal pixel size that we are aiming for. Azimuthal Equidistant projection was chosen here based on the following characteristics of this projection:

Use the number of rows and columns generated above from the AEQD projection to set a representative number of rows and columns in the Geographic area definition, which will be translated to degrees below. Then take the smaller of the two pixel dimensions to determine the output size and ensure square pixels.
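
The row and column estimate (project the swath corners to an Azimuthal Equidistant projection centered on the swath midpoint, then divide the extent by 70 m) can be sketched in plain Python. This sketch uses the spherical form of the projection written out by hand instead of pyproj, so it is only an approximation of what an ellipsoidal projection would return, and the corner coordinates are hypothetical.

```python
import math

R = 6371000.0  # mean Earth radius in meters (spherical approximation)

def aeqd_forward(lat, lon, lat0, lon0):
    """Project lat/lon (degrees) to spherical Azimuthal Equidistant x/y (meters)."""
    phi, lam = math.radians(lat), math.radians(lon)
    phi0, lam0 = math.radians(lat0), math.radians(lon0)
    cos_c = (math.sin(phi0) * math.sin(phi)
             + math.cos(phi0) * math.cos(phi) * math.cos(lam - lam0))
    c = math.acos(max(-1.0, min(1.0, cos_c)))   # angular distance from the center
    k = 1.0 if c == 0 else c / math.sin(c)
    x = R * k * math.cos(phi) * math.sin(lam - lam0)
    y = R * k * (math.cos(phi0) * math.sin(phi)
                 - math.sin(phi0) * math.cos(phi) * math.cos(lam - lam0))
    return x, y

# Hypothetical swath corners (degrees): lower left and upper right
ll_lat, ll_lon = 35.0, -121.0
ur_lat, ur_lon = 39.0, -116.0
mid_lat, mid_lon = (ll_lat + ur_lat) / 2, (ll_lon + ur_lon) / 2

ll_x, ll_y = aeqd_forward(ll_lat, ll_lon, mid_lat, mid_lon)
ur_x, ur_y = aeqd_forward(ur_lat, ur_lon, mid_lat, mid_lon)

pixel = 70  # nominal ECOSTRESS pixel size in meters
cols = round((ur_x - ll_x) / pixel)
rows = round((ur_y - ll_y) / pixel)
print(cols, rows)
```

Because distances from the projection center are preserved, the corner-to-corner extent in meters gives a reasonable estimate of how many 70 m pixels the scene spans.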

Below, square the pixels by setting the pixel size to the smaller of the x and y values output by the AreaDefinition, then use the pixel size to recalculate the number of output cols/rows.
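
The squaring step amounts to a few lines of arithmetic. In this sketch the extent and the initial row and column counts are hypothetical values, not output from an actual AreaDefinition.

```python
# Hypothetical geographic extent (degrees) and estimated grid size
ll_lon, ll_lat, ur_lon, ur_lat = -121.0, 35.0, -116.0, 39.0
cols, rows = 6339, 6351

# Pixel size along each axis, in degrees
ps_x = (ur_lon - ll_lon) / cols
ps_y = (ur_lat - ll_lat) / rows

# Use the smaller of the two so the pixels are square
ps = min(ps_x, ps_y)

# Recalculate the output dimensions with the square pixel size
cols = int(round((ur_lon - ll_lon) / ps))
rows = int(round((ur_lat - ll_lat) / ps))
print(ps, cols, rows)
```

Choosing the smaller pixel size increases the count along the other axis, so the grid is never coarser than the initial estimate in either direction.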

Below, use pyresample kd_tree's get_neighbour_info to create arrays with information on the nearest neighbor to each grid point.

This is the most computationally heavy task in the swath2grid conversion and using get_neighbour_info speeds up the process if you plan to resample multiple SDS within an ECOSTRESS product (compute once instead of for every SDS).

Above, the function is comparing the swath and area definitions to locate the nearest neighbor (neighbours=1). 210 is the radius_of_influence, or the radius used to search for the nearest neighboring pixel in the swath (in meters).
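
To illustrate why the lookup is split from the sampling, here is a toy brute-force version written in NumPy. It is not the pyresample implementation (which uses a k-d tree), and the function names `neighbour_info` and `sample_from_info` are hypothetical; the point is only that the nearest-neighbor indices are computed once and reused for every layer, with points beyond the search radius receiving the fill value.

```python
import numpy as np

def neighbour_info(src_xy, dst_xy, radius):
    """Step 1: for each target point, find the index of the nearest source point.

    Returns (index, valid), where valid is False when the nearest source
    point is farther away than `radius`.
    """
    # Pairwise distances, shape (n_target, n_source)
    d = np.linalg.norm(dst_xy[:, None, :] - src_xy[None, :, :], axis=2)
    index = d.argmin(axis=1)
    valid = d.min(axis=1) <= radius
    return index, valid

def sample_from_info(values, index, valid, fill_value):
    """Step 2: reuse the stored indices to resample any layer."""
    return np.where(valid, values[index], fill_value)

# Toy "swath" points and target grid points, in meters
src = np.array([[0.0, 0.0], [70.0, 0.0], [0.0, 70.0], [500.0, 500.0]])
dst = np.array([[10.0, 5.0], [60.0, 10.0], [250.0, 250.0]])

index, valid = neighbour_info(src, dst, radius=210)   # computed once

et = np.array([100.0, 200.0, 300.0, 400.0])           # first layer
unc = np.array([10.0, 20.0, 30.0, 40.0])              # second layer
et_grid = sample_from_info(et, index, valid, fill_value=-9999.0)
unc_grid = sample_from_info(unc, index, valid, fill_value=-9999.0)
print(et_grid, unc_grid)
```

The third target point has no source point within 210 m, so it receives the fill value in both layers.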


3c. Read SDS Metadata

List the attributes for the ETinst layer, which can then be used to define the fill value and scale factor.

Extract the scale factor, add offset and fill value from the SDS metadata.


3d. Perform K-D Tree Resampling

Remember that the resampling has been split into two steps. In section 3b, arrays containing the nearest neighbor to each grid point were created. The second step is to use those arrays to retrieve a resampled result.

Above, resample the swath ecoSD array using nearest neighbor (already calculated in section 3b and defined above as the index, outdex, and indexArr), and also set the fill value that was defined in section 3c.

Below, define the geotransform for the output (upper left x, horizontal pixel size, rotation, upper left y, rotation, vertical pixel size).
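
A minimal sketch of such a geotransform for a north-up grid, following the six-element ordering described above. The upper-left corner and pixel size are hypothetical values, and `pixel_to_coords` is a helper added here for illustration.

```python
# Hypothetical upper-left corner (degrees) and square pixel size
ul_x, ul_y = -121.0, 39.0
ps = 0.00063

# Geotransform for a north-up image:
# (upper left x, pixel width, rotation, upper left y, rotation, pixel height)
# Pixel height is negative because row numbers increase southward.
geotransform = (ul_x, ps, 0.0, ul_y, 0.0, -ps)

def pixel_to_coords(gt, col, row):
    """Return the map coordinates of the upper-left corner of pixel (col, row)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

print(pixel_to_coords(geotransform, 0, 0))
print(pixel_to_coords(geotransform, 100, 100))
```

With both rotation terms set to zero, moving one column shifts only x and moving one row shifts only y.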


3e. Basic Image Processing

Apply the scale factor and add offset to the resampled data, then set the fill value. All three values were extracted from the SDS metadata in section 3c.
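
One way to sketch this step in NumPy is shown below. The metadata values and the toy array are hypothetical; in practice they would come from the SDS attributes and the resampled output.

```python
import numpy as np

# Hypothetical metadata values; in practice these come from the SDS attributes
fill_value = -9999.0
scale_factor = 0.1
add_offset = 0.0

# Toy resampled array containing one fill pixel
resampled = np.array([[2505.0, -9999.0], [3100.0, 1202.5]])

# Mask fill pixels first so they are not rescaled, then apply scale and offset
masked = np.where(resampled == fill_value, np.nan, resampled)
scaled = masked * scale_factor + add_offset

# Restore the fill value so the output still carries a nodata marker
output = np.where(np.isnan(scaled), fill_value, scaled)
print(output)
```

Masking before scaling matters: multiplying the fill value by the scale factor would turn it into an ordinary-looking number that no longer matches the nodata marker.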

Rerun steps 3c - 3e for ETinstUncertainty.


4. Exporting Results


4a. Set Up a Dictionary

In this section, create a dictionary containing each of the arrays that will be exported as GeoTIFFs.


4b. Define CRS and Export as GeoTIFFs

Now that the data have been imported and resampled into a gridded raster array, export the results as GeoTIFFs using a for loop in this section.


5. Combining ECOSTRESS and AmeriFlux Tower Data


5a. Loading Tables with Pandas

Begin by opening a CSV file using the pandas package.

The AmeriFlux tower data was provided by Mike Goulden for the AmeriFlux US-CZ3 tower. The csv includes half-hourly observations of Latent Heat (W/m$^{2}$) for the same day as the ECOSTRESS observation.

Next, use the parser package and a lambda function to go through each time stamp and reformat to date and time objects.
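
The pattern of applying a lambda to each time stamp can be sketched with the standard library alone. This sketch swaps in `datetime.strptime` for the parser package and assumes a `YYYY-MM-DD HH:MM` layout; the actual layout of the time stamps in the tower file may differ.

```python
from datetime import datetime

# Hypothetical half-hourly time stamps in an assumed 'YYYY-MM-DD HH:MM' layout
stamps = ['2018-08-05 13:30', '2018-08-05 14:00', '2018-08-05 14:30']

# Lambda that turns one time stamp string into a datetime object
to_datetime = lambda s: datetime.strptime(s, '%Y-%m-%d %H:%M')

parsed = [to_datetime(s) for s in stamps]
dates = [p.date() for p in parsed]   # date objects
times = [p.time() for p in parsed]   # time objects
print(dates[0], times[0])
```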


5b. Locate ECOSTRESS Pixel from Lat/Lon Coordinates

Calculate the gridded pixel nearest to the tower location.
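
For a north-up grid with square pixels, one simple way to find the pixel containing a point is to divide its offset from the upper-left corner by the pixel size. The grid definition and the tower coordinates below are assumptions for illustration, and this is not necessarily how the original tutorial code locates the pixel.

```python
# Hypothetical grid definition: upper-left corner (degrees), pixel size, and shape
ul_lon, ul_lat = -121.0, 39.0
ps = 0.00063
n_rows, n_cols = 6351, 7939

# Assumed tower coordinates for illustration
tower_lat, tower_lon = 37.0674, -119.1951

# Columns count eastward from the left edge, rows count southward from the top edge
col = int((tower_lon - ul_lon) / ps)
row = int((ul_lat - tower_lat) / ps)

# Guard against a location that falls outside the gridded scene
inside = 0 <= row < n_rows and 0 <= col < n_cols
print(row, col, inside)
```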


6. Visualizing Data


6a. Create a Colormap

Before plotting the ET data, set up an Evapotranspiration color map using LinearSegmentedColormap from the matplotlib package.


6b. Calculate Local Overpass Time

ECOSTRESS observation times are reported in Coordinated Universal Time (UTC). Below, grab the observation time from the filename and convert to local time using the longitude location of the tower.

Next, convert UTC observation time to local overpass time.
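
A small sketch of the longitude-based conversion, using only the standard library. The observation time is the `20180805T220314` component from the filename format described earlier, and the tower longitude is an assumed value for illustration.

```python
from datetime import datetime, timedelta

# Observation start time from the filename component 20180805T220314 (UTC)
obs_utc = datetime.strptime('20180805T220314', '%Y%m%dT%H%M%S')

# Assumed tower longitude in decimal degrees (negative = west)
tower_lon = -119.1951

# The Earth rotates 15 degrees per hour, so longitude / 15 gives the offset in hours
offset = timedelta(hours=tower_lon / 15)
obs_local = obs_utc + offset
print(obs_local.strftime('%Y-%m-%d %H:%M:%S'))
```

Note that this yields an approximate local solar time based on longitude, which can differ from the civil time zone at the tower.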


6c. Plot ET Data

This section highlights the functionality of the matplotlib plotting package. First, make a plot of the entire gridded ET output. Next, zoom in on the tower location and add some additional parameters to the plot. Finally, export the completed plot to an image file.


6d. Exporting an Image

Zoom in to get a closer look at the region surrounding the AmeriFlux tower by creating a subset.

Make another plot, this time zoomed in to the tower location. Export the plot as a .png file.


7. Comparing Observations


7a. Calculate Distribution of ECOSTRESS Data

First, collect a 3x3 grid centered on the flux tower pixel as a subset to calculate statistics on.

In case the 3x3 grid contains missing values, use np.nanmedian to ignore missing values and calculate the measure of central tendency.
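
The windowing and median steps can be sketched on a toy array. The array values and the tower pixel location are hypothetical.

```python
import numpy as np

# Toy gridded ET array with one missing value
et = np.arange(25, dtype=float).reshape(5, 5)
et[2, 3] = np.nan

# Hypothetical row/column of the tower pixel
row, col = 2, 2

# 3x3 window centered on the tower pixel
window = et[row - 1:row + 2, col - 1:col + 2]

# nanmedian skips the NaN instead of propagating it
median = np.nanmedian(window)
print(window.shape, median)
```

A plain `np.median` on the same window would return NaN, which is why the NaN-aware variant is used here.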

Next, generate a probability density function for the 3x3 grid of ET values.


7b. Visualize Ground Observations

Next, examine the series of eddy covariance observations from the AmeriFlux US-CZ3 dataset.

Above, we can see the daily range in Latent Heat as captured by the eddy covariance observations on the flux tower.


7c. Combine ECOSTRESS and Ground Observations

Finally, compare the ECOSTRESS Evapotranspiration and uncertainty with the time series of observations from the flux tower.


Citations


Contact Information

Material written by Cole Krehbiel$^{1}$ and Gregory Halverson$^{2}$

Contact: LPDAAC@usgs.gov

Voice: +1-866-573-3222

Organization: Land Processes Distributed Active Archive Center (LP DAAC)

Website: https://lpdaac.usgs.gov/

Date last modified: 03-11-2021

$^{1}$Innovate! Inc., contractor to the U.S. Geological Survey, Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota, 57198-001, USA. Work performed under USGS contract G15PD00467 for LP DAAC$^{3}$.

$^{2}$Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA

$^{3}$LP DAAC Work performed under NASA contract NNG14HH33I.