Published: May 26, 2021
The Global Ecosystem Dynamics Investigation (GEDI) mission aims to characterize ecosystem structure and dynamics to enable radically improved quantification and understanding of the Earth's carbon cycle and biodiversity. The Land Processes Distributed Active Archive Center (LP DAAC) distributes the GEDI Level 1 and Level 2 Version 1 and Version 2 products. The LP DAAC created the GEDI Finder Web Service to allow users to perform spatial queries of GEDI Version 1 L1-L2 full-orbit granules. One of the updates for GEDI Version 2 included additional spatial metadata that allows users to perform spatial queries via a graphical user interface (GUI) using NASA's Earthdata Search or programmatically using NASA's Common Metadata Repository (CMR). Another update is that each GEDI V1 full-orbit granule has been divided into 4 sub-orbit granules in V2.
This tutorial was developed using an example use case for a current GEDI Finder user who has been using the GEDI Finder web service in R to find intersecting GEDI L2A Version 1 full-orbit granules over the Amazon Rainforest. The user is now looking to use the same workflow to find intersecting GEDI L2A V2 sub-orbit granules.
This tutorial will show how to use R to perform a spatial query for GEDI V2 data using NASA's CMR, how to reformat the CMR response into a list of links pointing to the intersecting sub-orbit granules on the LP DAAC Data Pool, and how to export the list of links as a text file.
This tutorial is written as an R Markdown Notebook. To execute the tutorial, users will need to have R/RStudio installed, including the packages required to execute an R Markdown notebook. The only package used in this tutorial is httr.
# Check for required packages, install if not previously installed
if ("httr" %in% rownames(installed.packages()) == FALSE) { install.packages("httr")}
# Import Packages
library(httr)
In the code cell below, define a function called gedi_finder that takes two user-submitted input values, a product and a bbox.
Three products are available for this function: 'GEDI01_B.002', 'GEDI02_A.002', and 'GEDI02_B.002'. A list is set up to relate each product shortname.version to its associated concept_id, which is a value used by NASA's CMR to retrieve data for a specific product. Additional information on concept IDs can be found in the CMR Search API Documentation.
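As a minimal sketch of that lookup, the snippet below wraps the product-to-concept-ID list in a small helper that fails with a readable message when an unsupported product is requested. The helper name get_concept_id is hypothetical and is not part of the original tutorial; the concept IDs are the ones used in the function defined later in this tutorial.

```r
# Product shortname.version -> CMR concept ID (values taken from this tutorial)
concept_ids <- list('GEDI01_B.002' = 'C1908344278-LPDAAC_ECS',
                    'GEDI02_A.002' = 'C1908348134-LPDAAC_ECS',
                    'GEDI02_B.002' = 'C1908350066-LPDAAC_ECS')

# Hypothetical helper: look up a concept ID and stop early on an unknown product
get_concept_id <- function(product) {
  if (!product %in% names(concept_ids)) {
    stop(sprintf("Unknown product '%s'. Options: %s",
                 product, paste(names(concept_ids), collapse = ", ")))
  }
  concept_ids[[product]]
}

get_concept_id('GEDI02_A.002')
```

Checking the product name up front may give a clearer error than letting an empty concept ID reach the request URL.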
The second user-submitted input value, bbox, is a string of bounding box coordinate values (decimal degrees) in the following format:
Lower Left Longitude, Lower Left Latitude, Upper Right Longitude, Upper Right Latitude ("LLLon,LLLat,URLon,URLat")
Example:
'-73.65,-12.64,-47.81,9.7'
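Because the bounding box is passed as a free-form string, it can help to clean and check it before sending a request. The sketch below is one possible way to do that; normalize_bbox is a hypothetical helper name, and the specific checks are assumptions rather than rules taken from the tutorial or from CMR. Longitude order is deliberately not checked, on the assumption that a box may cross the antimeridian.

```r
# Hypothetical helper (not part of the original tutorial): clean and validate a bbox string
normalize_bbox <- function(bbox) {
  bbox <- gsub("\\s", "", bbox)  # drop every whitespace character
  coords <- suppressWarnings(as.numeric(strsplit(bbox, ",", fixed = TRUE)[[1]]))
  if (length(coords) != 4 || anyNA(coords)) {
    stop("bbox must contain four comma-separated numbers: LLLon,LLLat,URLon,URLat")
  }
  if (any(abs(coords[c(1, 3)]) > 180) || any(abs(coords[c(2, 4)]) > 90)) {
    stop("bbox coordinates are outside the valid longitude/latitude range")
  }
  if (coords[2] > coords[4]) {
    stop("lower left latitude must not exceed upper right latitude")
  }
  bbox  # return the cleaned string, ready to use in a query
}

normalize_bbox("-73.65, -12.64, -47.81, 9.7")
```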
# Define Function to Query CMR
gedi_finder <- function(product, bbox) {
  # Define the base CMR granule search url, including LPDAAC provider name and max page size (2000 is the max allowed)
  cmr <- "https://cmr.earthdata.nasa.gov/search/granules.json?pretty=true&provider=LPDAAC_ECS&page_size=2000&concept_id="
  # Set up list where key is GEDI shortname + version and value is CMR Concept ID
  concept_ids <- list('GEDI01_B.002'='C1908344278-LPDAAC_ECS',
                      'GEDI02_A.002'='C1908348134-LPDAAC_ECS',
                      'GEDI02_B.002'='C1908350066-LPDAAC_ECS')
  # CMR uses pagination for queries with more features returned than the page size
  page <- 1
  bbox <- gsub(' ', '', bbox)  # Remove all white spaces (sub() would only remove the first one)
  granules <- list()           # Set up a list to store and append granule links to
  # Send GET request to CMR granule search endpoint w/ product concept ID, bbox & page number
  # (page_num is the documented CMR paging parameter)
  cmr_url <- sprintf("%s%s&bounding_box=%s&page_num=%s", cmr, concept_ids[[product]], bbox, page)
  cmr_response <- GET(cmr_url)
  # Verify the request submission was successful
  if (cmr_response$status_code == 200) {
    # Format the features returned by the first request as a list
    page_entries <- content(cmr_response)$feed$entry
    entries <- page_entries
    # If a full page (2000 features) is returned, move to the next page, submit another request, and append to the response
    # (checking only the latest page avoids looping forever on 0 results or an exact multiple of 2000)
    while (length(page_entries) == 2000) {
      page <- page + 1
      cmr_url <- sprintf("%s%s&bounding_box=%s&page_num=%s", cmr, concept_ids[[product]], bbox, page)
      page_entries <- content(GET(cmr_url))$feed$entry
      entries <- c(entries, page_entries)
    }
    # CMR returns more info than just the Data Pool links, below use for loop to grab each DP link, and add to list
    # (seq_along() skips the loop cleanly when no granules were found)
    for (i in seq_along(entries)) {
      granules[[i]] <- entries[[i]]$links[[1]]$href
    }
    # Return the list of links
    return(granules)
  } else {
    # If the request did not complete successfully, print out the response from CMR
    print(content(cmr_response)$errors)
  }
}
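The paging logic can be easier to reason about in isolation. The sketch below separates it from the HTTP request: fetch_all_pages and fake_fetch are hypothetical names, and fake_fetch is a stand-in that serves made-up entries instead of calling CMR. It illustrates one stop condition that terminates for empty results and for result counts that are an exact multiple of the page size.

```r
# Hypothetical helper: collect all pages from any function that returns one page of entries
fetch_all_pages <- function(fetch_page, page_size = 2000) {
  entries <- list()
  page <- 1
  repeat {
    page_entries <- fetch_page(page)
    entries <- c(entries, page_entries)
    # A short (or empty) page means there is nothing left to request
    if (length(page_entries) < page_size) break
    page <- page + 1
  }
  entries
}

# Stand-in for CMR: 5 made-up entries served 2 per page
fake_entries <- as.list(paste0("granule_", 1:5))
fake_fetch <- function(page, page_size = 2) {
  start <- (page - 1) * page_size + 1
  if (start > length(fake_entries)) return(list())
  fake_entries[start:min(start + page_size - 1, length(fake_entries))]
}

all_entries <- fetch_all_pages(fake_fetch, page_size = 2)
length(all_entries)
```

In a real query, fetch_page would wrap the GET request for a given page number, so the same loop could be reused without changes.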
The function returns a list of links to download the intersecting GEDI sub-orbit V2 granules directly from the LP DAAC's Data Pool.
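The function takes the first link of each CMR entry as the Data Pool link. If the order of links in a response ever differed, a slightly more defensive option would be to select the link by file extension. The sketch below assumes the data files end in .h5 (GEDI products are distributed as HDF5); get_data_link is a hypothetical helper, and the mock entry contains made-up URLs shaped like the parsed response.

```r
# Hypothetical helper: pick the first link in an entry that looks like an HDF5 data file
get_data_link <- function(entry) {
  hrefs <- vapply(entry$links, function(link) link$href, character(1))
  h5 <- hrefs[grepl("\\.h5$", hrefs)]
  if (length(h5) == 0) NA_character_ else h5[[1]]
}

# Mock entry with made-up URLs; the data link is intentionally not first
mock_entry <- list(links = list(
  list(href = "https://example.com/browse/granule.png"),
  list(href = "https://example.com/data/granule.h5")
))

get_data_link(mock_entry)
```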
Next, assign the user-provided inputs for the gedi_finder function to variables.
# User-provided inputs (UPDATE FOR YOUR DESIRED PRODUCT AND BOUNDING BOX REGION OF INTEREST)
product <- 'GEDI02_B.002' # Options include 'GEDI01_B.002', 'GEDI02_A.002', 'GEDI02_B.002'
bbox <- '-73.65,-12.64,-47.81,9.7' # bounding box coords in LL Longitude, LL Latitude, UR Longitude, UR Latitude format
Above, the variables are defined to query the GEDI02_B.002 product for a bounding box covering the Amazon Rainforest.
Next, call the gedi_finder function for the desired product and bounding box region of interest defined above, and set the output to a variable.
# Call the gedi_finder function using the user-provided inputs
granules <- gedi_finder(product, bbox)
print(sprintf("%s %s Version 2 granules found.", length(granules), product))
The print statement above reports how many granules intersect your bounding box for the requested product.
Below is a demonstration of how to take the granules list of Data Pool links for intersecting GEDI V2 granules and export it as a text file. The text file will be written to your current working directory and named based on the date and time that the file was created.
# Export Results
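# Set up output textfile name using the current datetime (fixed = TRUE matches the literal '.002' rather than treating '.' as a regex wildcard)
outName <- sprintf("%s_GranuleList_%s.txt", sub('.002', '_002', product, fixed = TRUE), format(Sys.time(), "%Y%m%d%H%M%S"))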
# Save to text file in current working directory
write.table(granules, outName, row.names = FALSE, col.names = FALSE, quote = FALSE, sep='\n')
print(sprintf("File containing links to intersecting %s Version 2 data has been saved to: %s/%s", product, getwd(), outName))
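To confirm that an exported list can be read back as one link per line, a quick round trip can be run on a temporary file. This is only an illustration: the links are made up, and writeLines is used here as a simpler alternative to write.table, not as the method the tutorial prescribes.

```r
# Illustrative only: write a few made-up links, then read them back
example_links <- list("https://example.com/a.h5", "https://example.com/b.h5")
example_file <- tempfile(fileext = ".txt")

# unlist() turns the list of links into a character vector, one element per line
writeLines(unlist(example_links), example_file)

# Read the file back and check that every link survived the round trip
links_read <- readLines(example_file)
identical(links_read, unlist(example_links))
```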
Looking to bulk download the intersecting GEDI V2 files from your request? Check out the following LP DAAC resources to get you started:
1. How to Access LP DAAC Data from the Command Line
2. How to Access the LP DAAC Data Pool with Python
3. How to Access the LP DAAC Data Pool with R
Also be sure to check out the following GEDI Resources:
1. GEDI Spatial Querying and Subsetting Quick Guide V2: explains how to perform spatial querying and subsetting of GEDI V2 data directly in NASA's Earthdata Search Client.
2. GEDI Spatial and Band/layer Subsetting and Export to GeoJSON (GEDI Subsetter) Script: allows you to subset GEDI V2 data by band/layer and region of interest.
3. Getting Started with GEDI L1B, L2A, and L2B V2 Data in Python Tutorial Series: includes a series of tutorials that demonstrate how to start working with GEDI V2 data in Python.