4 Satellite Data Acquisition Software and Settings
Currently, all satellite data in AquaMatch are obtained using the Python API for Google Earth Engine (Gorelick 2023). While the {targets} workflow orchestrates data acquisition, all code directly related to GEE data acquisition is written in Python. If you are running the ‘default’ configuration for lakeSR or siteSR, the following directions are not applicable.
4.1 R Environment and Package Information
We recommend using the most recent release of R, RStudio, {targets}, and all R packages used within the workflow. The most recent run of lakeSR and siteSR was completed using R version 4.5.0 (R Core Team 2025) in RStudio 2025.05.0 (Posit team 2023). The table below lists the packages used across the lakeSR and siteSR workflows, the versions used, and the associated citations.
| R Package | Version | Citation |
|---|---|---|
| arrow | 20.0.0.2 | Richardson et al. (2025) |
| bookdown | 0.43 | Xie (2016); Xie (2025) |
| config | 0.3.2 | Allaire (2023) |
| cowplot | 1.1.3 | Wilke (2024) |
| crew | 1.1.2 | Landau (2025) |
| data.table | 1.17.4 | Barrett et al. (2025) |
| deming | 1.4-1 | Therneau (2024) |
| ggrepel | 0.9.6 | Slowikowski (2024) |
| ggthemes | 5.1.0 | Arnold (2024) |
| googledrive | 2.1.1 | McGowan and Bryan (2023) |
| kableExtra | 1.4.0 | Zhu (2024) |
| nhdplusTools | 0.6.2 | Blodgett and Johnson (2023) |
| polylabelr | 0.3.0 | Larsson (2024) |
| reticulate | 1.42.0 | Ushey, Allaire, and Tang (2025) |
| rmapshaper | 0.5.0 | Teucher and Russell (2023) |
| sf | 1.0-21 | Pebesma (2018); Pebesma and Bivand (2023) |
| tarchetypes | 0.13.1 | Landau (2021a) |
| targets | 1.11.3 | Landau (2021b) |
| tidyverse | 2.0.0 | Wickham et al. (2019) |
| tigris | 2.2.1 | Walker (2025) |
| USA.state.boundaries | 1.0.1 | Embry (2022) |
| viridis | 0.6.5 | Garnier et al. (2024) |
| xml2 | 1.3.8 | Wickham, Hester, and Ooms (2025) |
| yaml | 2.3.10 | Garbett et al. (2024) |
4.2 {reticulate} Conda Environment
RStudio (Posit team 2023) is an integrated development environment that, alongside the {reticulate} package (Ushey, Allaire, and Tang 2025), facilitates integration of R and Python code within the same environment. In AquaMatch, we use a single R script to set up a {reticulate} Conda environment that is invoked at the beginning of a {targets} run to ensure that our Python code runs consistently.
| Software/Python Module | Version | Citation |
|---|---|---|
| Python | 3.10.13 | Python Software Foundation, www.python.org |
| earthengine-api | 1.4.0 | Gorelick (2023) |
| pandas | 2.0.3 | The pandas development team (2023) |
| pyreadr | 0.5.2 | Fajardo (2023) |
| PyYAML | 6.0.2 | The PyYAML Project, https://github.com/yaml/pyyaml |
| numpy | 1.24.4 | Harris et al. (2020) |
The script run_targets.Rmd includes the steps to create this environment and
authenticate your GEE user. Run these steps before running the pipeline to
ensure that the workflow runs smoothly.
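As a quick sanity check after the environment is created, the installed module versions can be compared against the table above. The following is a minimal sketch, not part of the AquaMatch code base: the function name `compare_versions` and the `PINNED` dictionary are illustrative, and the versions are simply copied from the table. It assumes the script is run with the Python interpreter from the {reticulate} Conda environment.

```python
from importlib import metadata

# Versions copied from the table above. They describe the most recent run
# and are used here only for comparison, not as hard requirements.
PINNED = {
    "earthengine-api": "1.4.0",
    "pandas": "2.0.3",
    "pyreadr": "0.5.2",
    "PyYAML": "6.0.2",
    "numpy": "1.24.4",
}


def compare_versions(pinned):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in compare_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

A version mismatch does not necessarily mean the workflow will fail; it only indicates that the environment differs from the one used for the most recent run.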
4.3 Google Earth Engine Setup
If running the ‘admin_update’ configuration for either lakeSR or siteSR, you will need to have a GEE account, install and configure the gcloud Command Line Interface (CLI), create an Earth Engine Project, successfully authenticate your account, and alter the configuration file. All of these tasks are described in the sections below.
4.3.1 Create a GEE account
Creation of a GEE account is free. Click ‘Get Started’ at the far right side of the earthengine.google.com webpage to create an account.
4.3.2 gcloud CLI
This workflow requires the installation and initialization of the gcloud
CLI, a command-line tool set for accessing
Google Cloud resources. All settings for AquaSat v2 are default gcloud
configurations using a single GEE project. The official gcloud CLI
documentation describes how to install and set up gcloud.
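Before moving on, it can be useful to confirm that the gcloud executable is actually discoverable on your `PATH`. The sketch below uses only the Python standard library. The helper names `find_cli` and `cli_version` are hypothetical and are not part of the workflow; the only gcloud feature assumed is the standard `gcloud --version` command.

```python
import shutil
import subprocess


def find_cli(name="gcloud"):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


def cli_version(path):
    """Run `<path> --version` and return whatever it prints to stdout."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


if __name__ == "__main__":
    gcloud_path = find_cli()
    if gcloud_path is None:
        print("gcloud was not found on PATH; install and initialize it first.")
    else:
        print(f"gcloud found at {gcloud_path}")
        print(cli_version(gcloud_path))
```

Finding the executable only shows that gcloud is installed; it does not confirm that gcloud has been initialized with your Google account.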
4.3.3 GEE project setting
AquaMatch is run in a specific GEE project associated with our authenticated
Google account. If you wish to re-run this code as written, you will not have
proper access because the code refers to our specific GEE project. You will need
to update the configuration YAML file (in lakeSR:
b_pull_Landsat_SRST_poi/config_files/config_poi.yml, in siteSR:
gee_config.yml) with your Google credentials and GEE project in order to run
the pipeline locally. If you are new to GEE, go to
code.earthengine.google.com, find the project name listed in the top right-hand
corner of your screen, and enter it in the configuration file.
Alternatively, you can create a GEE project for this task in the dropdown menu accessed by clicking on the icon to the right of the highlighted box in the figure above. This workflow will not run without specifying an Earth Engine Project that is managed by the Google Account you authenticate this run with.
4.3.4 GEE Authentication
Once gcloud is installed and initialized, the configuration file is properly
set up, and the Python Conda environment is set up, you can authenticate your GEE
instance. For this workflow, this is completed in the run_targets.Rmd script
in the root directory. This script provides explicit directions to complete this
task before running the pipeline.
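For orientation, authentication and initialization with the Earth Engine Python API generally follow the pattern sketched below. This is an illustrative sketch, not the code in run_targets.Rmd, which remains the authoritative set of directions. The wrapper `initialize_gee` and the project ID in the usage comment are hypothetical; `ee.Authenticate()` and `ee.Initialize(project=...)` are the calls provided by the earthengine-api package.

```python
def initialize_gee(project, authenticate=False):
    """Initialize the Earth Engine client for a given Earth Engine project.

    `project` is the project ID you entered in the configuration file.
    Set `authenticate=True` the first time to start the interactive
    authentication flow for your Google account.
    """
    if not isinstance(project, str) or not project.strip():
        raise ValueError("An Earth Engine project ID is required.")

    # Imported lazily so the check above works even where the
    # earthengine-api package is not installed.
    import ee

    if authenticate:
        ee.Authenticate()  # interactive; stores credentials locally
    ee.Initialize(project=project.strip())
    return ee


# Example usage (hypothetical project ID, replace with your own):
# ee = initialize_gee("my-ee-project", authenticate=True)
```

Passing the project explicitly mirrors the requirement above that the workflow specify an Earth Engine Project managed by the Google Account used for authentication.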

