Working with Daily NASA VIIRS Surface Reflectance Data

This tutorial demonstrates how to work with the daily S-NPP NASA VIIRS Surface Reflectance (VNP09GA.001) data product.

The Land Processes Distributed Active Archive Center (LP DAAC) distributes the NASA-produced surface reflectance data products from the Visible Infrared Imaging Radiometer Suite (VIIRS) sensor aboard the joint NOAA/NASA Suomi-National Polar-orbiting Partnership (S-NPP) satellite. The S-NPP NASA VIIRS Surface Reflectance products are archived and distributed in the HDF-EOS5 file format. The Hierarchical Data Format version 5 (HDF5) is a NASA selected format of choice for standard science product archival and distribution and is the underlying format for HDF-EOS5. HDF-EOS is the standard format and I/O library for the Earth Observing System (EOS) Data and Information System (EOSDIS). HDF-EOS5 extends the capabilities of the HDF5 storage format, adding support for the various EOS data types (e.g., point, swath, grid) into the HDF5 framework.

In this tutorial, you will use Python to define the coordinate reference system (CRS) and export science dataset (SDS) layers as GeoTIFF files that can be loaded into a GIS and/or Remote Sensing software program.


Example: Converting VNP09GA files into quality-filtered vegetation indices and examining the trend in vegetation greenness during July 2018 to observe drought in Europe.

Data Used in the Example:

  • Data Product: VIIRS/NPP Surface Reflectance Daily L2G Global 1km and 500m SIN Grid Version 1 (VNP09GA.001)
    • Science Dataset (SDS) layers:
      • SurfReflect_M3_1
      • SurfReflect_M4_1
      • SurfReflect_M5_1
      • SurfReflect_M7_1

Topics Covered:

  1. Getting Started
    1a. Import Packages
    1b. Set Up the Working Environment
    1c. Retrieve Files
  2. Importing and Interpreting Data
    2a. Open a VIIRS HDF5-EOS File and Read File Metadata
    2b. Subset SDS Layers and read SDS Metadata
  3. Generating an RGB Composite
    3a. Apply a Scale Factor
    3b. Create an Image
  4. Quality Filtering
    4a. Decode Bit Values
    4b. Apply Quality and Land/Water Masks
  5. Functions
    5a. Defining Functions
    5b. Executing Functions
  6. Visualizing Results
    6a. Plot Quality-Filtered VIs
    6b. Scatterplot Comparison of NDVI vs. EVI
  7. Georeferencing
  8. Exporting Results
    8a. Set Up a Dictionary
    8b. Export Masked Arrays as GeoTIFFs
  9. Working with Time Series
    9a. Automation of Steps 2-5, 7-8
    9b. Time Series Analysis
    9c. Time Series Visualization

Before Starting this Tutorial:

Dependencies:

  • This tutorial was tested using Python 3.6.1.
  • A NASA Earthdata Login account is required to download the data used in this tutorial. You can create an account at the link provided.
  • Libraries: pandas, numpy, h5py, matplotlib, datetime, skimage
  • To execute Sections 7-9, the Geospatial Data Abstraction Library (GDAL) is required.

Example Data:

This tutorial uses the VNP09GA.001 tile h18v03 product on July 2, 2018. Use this link to download the file directly from the LP DAAC Data Pool: https://e4ftl01.cr.usgs.gov/VIIRS/VNP09GA.001/2018.07.02/VNP09GA.A2018183.h18v03.001.2018184074906.h5.

Section 9 features time series data, and you will need to download all of the files in order to execute it. All of the files used in Section 9 can be downloaded via a text file containing links to each of the files on the LP DAAC Data Pool or via NASA Earthdata Search using the query in the link provided.

Source Code used to Generate this Tutorial:

NOTE: This tutorial was developed for NASA VIIRS Daily Surface Reflectance HDF-EOS5 files and should only be used for the VNP09GA.001 product.


1. Getting Started

1a. Import Packages

Import the python packages required to complete this tutorial.

In [1]:
import h5py, os
import numpy as np
import matplotlib.pyplot as plt
from osgeo import gdal, gdal_array
import datetime as dt
import pandas as pd
from skimage import exposure

1b. Set Up the Working Environment

Define the input directory containing the VNP09GA.001 files, and create an output directory to store the results from this tutorial.

Change the inDir variable below to the path of the directory where your data is stored (e.g., C:/Data/VIIRS/VNP09A1/).

In [2]:
inDir = 'D:/NASA_VIIRS_SurfaceReflectance/'                             # Update to dir on your OS containing VIIRS files
os.chdir(inDir)                                                         # Change to working directory
outDir = os.path.normpath(os.path.split(inDir)[0]+os.sep+'output')+'\\' # Set output directory
if not os.path.exists(outDir): os.makedirs(outDir)                      # Create output directory

If you plan to execute this tutorial on your own OS, `inDir` above needs to be changed.


1c. Retrieve Files

In this section, a series of VNP09GA .h5 files have been downloaded to the inDir defined above.

Make sure that you download the files into the inDir directory defined above to follow along in the tutorial.

In [3]:
inFile = 'VNP09GA.A2018183.h18v03.001.2018184074906.h5' # File used in example
outName = inFile.rsplit('.', 1)[0]                      # Parse out the file extension, keep file name
outName
Out[3]:
'VNP09GA.A2018183.h18v03.001.2018184074906'

The standard format for VIIRS filenames is as follows:

VNP09GA: Product Short Name
A2018183: Julian Date of Acquisition (AYYYYDDD)
h18v03: Tile Identifier (horizontalXXverticalYY)
001: Collection Version
2018184074906: Julian Date of Production (YYYYDDDHHMMSS)

Below, take the Julian Date of Acquisition and convert to calendar date using the datetime package.

In [4]:
yeardoy = inFile.split('.')[1][1:]                                  # Split filename to retrieve acquisition date
date = dt.datetime.strptime(yeardoy, '%Y%j').strftime('%m/%d/%Y')   # Convert YYYYDDD to MM/DD/YYYY
date
Out[4]:
'07/02/2018'

2. Importing and Interpreting Data

2a. Open a VNP09GA HDF-EOS5 File and Read File Metadata

Read in a VNP09GA HDF-EOS5 file using the h5py package.

In [5]:
f = h5py.File(inFile) # Read in VIIRS HDF-EOS5 file and set to variable
In [6]:
list(f.keys()) # List groups in file
Out[6]:
['HDFEOS',
 'HDFEOS INFORMATION',
 'SensorAzimuth_c',
 'SensorZenith_c',
 'SolarAzimuth_c',
 'SolarZenith_c',
 'SurfReflect_I1_c',
 'SurfReflect_I2_c',
 'SurfReflect_I3_c',
 'SurfReflect_M10_c',
 'SurfReflect_M11_c',
 'SurfReflect_M1_c',
 'SurfReflect_M2_c',
 'SurfReflect_M3_c',
 'SurfReflect_M4_c',
 'SurfReflect_M5_c',
 'SurfReflect_M7_c',
 'SurfReflect_M8_c',
 'SurfReflect_QF1_c',
 'SurfReflect_QF2_c',
 'SurfReflect_QF3_c',
 'SurfReflect_QF4_c',
 'SurfReflect_QF5_c',
 'SurfReflect_QF6_c',
 'SurfReflect_QF7_c',
 'iobs_res_c',
 'nadd_obs_row_1km',
 'nadd_obs_row_500m',
 'obscov_1km_c',
 'obscov_500m_c',
 'orbit_pnt_c']

The VNP09GA HDF-EOS5 file contains two groups in which data and metadata are stored.

First, the HDFEOS group contains the VNP09GA SDS layers.

In [7]:
list(f['HDFEOS']) # List contents of HDFEOS group
Out[7]:
['ADDITIONAL', 'GRIDS']

Continue into the GRIDS directory.

In [8]:
grids = list(f['HDFEOS']['GRIDS']) # List contents of GRIDS directory
grids
Out[8]:
['VNP_Grid_1km_2D', 'VNP_Grid_500m_2D']

TheVNP09GA data product provides data for three imagery bands (I1, I2, I3) at nominal 500 meter resolution (~ 463 meter) and nine moderate-resolution bands (M1, M2, M3, M4, M5, M7, M8, M10, M11) at nominal 1 kilometer (km) (~926 meter) resolution. This example uses the nominal 1 km M bands.

In [9]:
list(f['HDFEOS']['GRIDS']['VNP_Grid_1km_2D']) # List contents of 1km directory
Out[9]:
['Data Fields']

Below, list the SDS contained in the HDF-EOS5 file at 1 km resolution.

In [10]:
list(f['HDFEOS']['GRIDS']['VNP_Grid_1km_2D']['Data Fields']) # List Data Fields
Out[10]:
['SensorAzimuth_1',
 'SensorZenith_1',
 'SolarAzimuth_1',
 'SolarZenith_1',
 'SurfReflect_M10_1',
 'SurfReflect_M11_1',
 'SurfReflect_M1_1',
 'SurfReflect_M2_1',
 'SurfReflect_M3_1',
 'SurfReflect_M4_1',
 'SurfReflect_M5_1',
 'SurfReflect_M7_1',
 'SurfReflect_M8_1',
 'SurfReflect_QF1_1',
 'SurfReflect_QF2_1',
 'SurfReflect_QF3_1',
 'SurfReflect_QF4_1',
 'SurfReflect_QF5_1',
 'SurfReflect_QF6_1',
 'SurfReflect_QF7_1',
 'num_observations_1km',
 'obscov_1km_1',
 'orbit_pnt_1']

Second, the HDFEOS INFORMATION group contains the metadata for the file. The StructMetadata.0 object stores the file metadata.

In [11]:
list(f['HDFEOS INFORMATION']) # List contents of HDFEOS group
Out[11]:
['StructMetadata.0']

Below, read the file metadata and store as a list.

In [12]:
fileMetadata = f['HDFEOS INFORMATION']['StructMetadata.0'].value.split() # Read file metadata
fileMetadata = [m.decode('utf-8') for m in fileMetadata]                 # Clean up file metadata
fileMetadata[0:33]                                                       # Print a subset of the entire file metadata record
Out[12]:
['GROUP=SwathStructure',
 'END_GROUP=SwathStructure',
 'GROUP=GridStructure',
 'GROUP=GRID_1',
 'GridName="VNP_Grid_1km_2D"',
 'XDim=1200',
 'YDim=1200',
 'UpperLeftPointMtrs=(0.000000,6671703.118000)',
 'LowerRightMtrs=(1111950.519667,5559752.598333)',
 'Projection=HE5_GCTP_SNSOID',
 'ProjParams=(6371007.181000,0,0,0,0,0,0,0,0,0,0,0,0)',
 'SphereCode=-1',
 'GridOrigin=HE5_HDFE_GD_UL',
 'GROUP=Dimension',
 'OBJECT=Dimension_1',
 'DimensionName="YDim"',
 'Size=1200',
 'END_OBJECT=Dimension_1',
 'OBJECT=Dimension_2',
 'DimensionName="XDim"',
 'Size=1200',
 'END_OBJECT=Dimension_2',
 'END_GROUP=Dimension',
 'GROUP=DataField',
 'OBJECT=DataField_1',
 'DataFieldName="num_observations_1km"',
 'DataType=H5T_NATIVE_SCHAR',
 'DimList=("YDim","XDim")',
 'MaxdimList=("YDim","XDim")',
 'CompressionType=HE5_HDFE_COMP_DEFLATE',
 'DeflateLevel=8',
 'TilingDimensions=(30,1200)',
 'END_OBJECT=DataField_1']

Identify all the objects in the VNP09GA HDF-EOS5 file below.

In [13]:
h5_objs = []            # Create empty list
f.visit(h5_objs.append) # Walk through directory tree, retrieve objects and append to list
h5_objs
Out[13]:
['HDFEOS',
 'HDFEOS/ADDITIONAL',
 'HDFEOS/ADDITIONAL/FILE_ATTRIBUTES',
 'HDFEOS/GRIDS',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SensorAzimuth_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SensorZenith_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SolarAzimuth_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SolarZenith_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M10_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M11_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M1_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M2_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M3_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M4_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M5_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M7_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M8_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF1_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF2_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF3_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF4_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF5_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF6_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF7_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/num_observations_1km',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/obscov_1km_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/orbit_pnt_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/SurfReflect_I1_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/SurfReflect_I2_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/SurfReflect_I3_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/iobs_res_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/num_observations_500m',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/obscov_500m_1',
 'HDFEOS INFORMATION',
 'HDFEOS INFORMATION/StructMetadata.0',
 'SensorAzimuth_c',
 'SensorZenith_c',
 'SolarAzimuth_c',
 'SolarZenith_c',
 'SurfReflect_I1_c',
 'SurfReflect_I2_c',
 'SurfReflect_I3_c',
 'SurfReflect_M10_c',
 'SurfReflect_M11_c',
 'SurfReflect_M1_c',
 'SurfReflect_M2_c',
 'SurfReflect_M3_c',
 'SurfReflect_M4_c',
 'SurfReflect_M5_c',
 'SurfReflect_M7_c',
 'SurfReflect_M8_c',
 'SurfReflect_QF1_c',
 'SurfReflect_QF2_c',
 'SurfReflect_QF3_c',
 'SurfReflect_QF4_c',
 'SurfReflect_QF5_c',
 'SurfReflect_QF6_c',
 'SurfReflect_QF7_c',
 'iobs_res_c',
 'nadd_obs_row_1km',
 'nadd_obs_row_500m',
 'obscov_1km_c',
 'obscov_500m_c',
 'orbit_pnt_c']

2b. Subset SDS Layers and read SDS Metadata

Identify and generate a list of all the SDS layers in the HDF-EOS5 file.

In [14]:
# Search for SDS with 1km or 500m grid
all_datasets = [obj for grid in grids for obj in h5_objs if isinstance(f[obj],h5py.Dataset) and grid in obj] 
all_datasets
Out[14]:
['HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SensorAzimuth_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SensorZenith_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SolarAzimuth_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SolarZenith_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M10_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M11_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M1_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M2_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M3_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M4_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M5_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M7_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_M8_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF1_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF2_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF3_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF4_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF5_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF6_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/SurfReflect_QF7_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/num_observations_1km',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/obscov_1km_1',
 'HDFEOS/GRIDS/VNP_Grid_1km_2D/Data Fields/orbit_pnt_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/SurfReflect_I1_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/SurfReflect_I2_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/SurfReflect_I3_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/iobs_res_1',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/num_observations_500m',
 'HDFEOS/GRIDS/VNP_Grid_500m_2D/Data Fields/obscov_500m_1']

Read in the near infrared (SurfReflect_M7), red (SurfReflect_M5), green (SurfReflect_M4), and blue (SurfReflect_M3) SDS layers from the HDF-EOS5 file. These layers will be used later in the tutorial to calculate daily vegetation indices.

In [15]:
r = f[[a for a in all_datasets if 'M5' in a][0]] # M5 = Red
g = f[[a for a in all_datasets if 'M4' in a][0]] # M4 = Green
b = f[[a for a in all_datasets if 'M3' in a][0]] # M3 = Blue
n = f[[a for a in all_datasets if 'M7' in a][0]] # M7  = NIR

List the attributes for one of the SDS.

In [16]:
list(r.attrs) # List attributes
Out[16]:
['long_name',
 'units',
 'valid_range',
 '_FillValue',
 'Offset',
 'Scale',
 'FILL_VALUES']

Extract the scale factor and fill value from the SDS metadata.

In [17]:
scaleFactor = r.attrs['Scale'][0]    # Set scale factor to a variable
fillValue = r.attrs['_FillValue'][0] # Set fill value to a variable

3. Generating an RGB Composite

3a. Apply a Scale Factor

Extract the data for each SDS and apply the scale factor defined in the previous section. The scale factor is the same for all surface reflectance M-bands, so you can simply use the scale factor from the red band for the other bands.

In [18]:
red = r.value * scaleFactor
green = g.value * scaleFactor
blue = b.value * scaleFactor
nir = n.value * scaleFactor 

Create an RGB array using np.dstack() and set the fill values equal to nan.

In [19]:
rgb = np.dstack((red,green,blue))            # Create RGB array
rgb[rgb == fillValue * scaleFactor] = 0      # Set fill value equal to 0

3b. Create an Image

Before filtering the data for poor quality observations such as clouds, visualize the non-filtered RGB image using the matplotlib package.

In [20]:
# Set plots inside of the Jupyter Notebook
%matplotlib inline 

Below, use the skimage package's exposure module (http://scikit-image.org/docs/dev/api/skimage.exposure.html) to apply a contrast stretch and gamma correction to the RGB composite for a better looking image.

In [21]:
p2, p98 = np.percentile(rgb, (2, 98))                              # Calculate 2nd,98th percentile for updating min/max vals
rgbStretched = exposure.rescale_intensity(rgb, in_range=(p2, p98)) # Perform contrast stretch on RGB range
rgbStretched = exposure.adjust_gamma(rgbStretched, 0.5)            # Perform Gamma Correction

Note: stretching the array above inherently changes the values of the array, and those values should be used for visualization purposes only.

Take the stretched RGB composite and plot the results.

In [22]:
    fig = plt.figure(figsize =(10,10))                           # Set the figure size
    ax = plt.Axes(fig,[0,0,1,1]) 
    ax.set_axis_off()                                            # Turn off axes
    fig.add_axes(ax)
    ax.imshow(rgbStretched, interpolation='bilinear', alpha=0.9) # Plot a natural color RGB
    fig.savefig('{}{}_RGB.jpg'.format(outDir, outName))          # Export natural color RGB as jpeg