Section 7 Data Fusion

We’ll now test and approach that uses both aerial point cloud data (for structural information) and RGB imagery (for spectral information) which is a data fusion approach. First, we’ll identify initial candidate slash piles based on their structural form using our raster-based watershed segmentation approach. Then, we’ll use the RGB imagery to filter these candidates spectrally: relevant RGB vegetation indices such as Green Red Vegetation Index (GRVI), Red Green Ratio Index (RGRI), Visible Band-Difference Vegetation Index (VDVI), Red Green Ratio Index (RGRI), Red Green Blue Vegetation Index (RGBVI), and Excess Green (ExG) are calculated for each candidate segment, and thresholds are applied to remove those exhibiting high greenness. Index thresholds tested include those found to perform well in distinguising green vegetation in previous research (Motohka et al. 2010; Wang et al. 2025; Riehle et al. 2020). Filtering candidate segments using spectral data will enhance our method’s ability to distinguish non-photosynthetic slash piles from living vegetation (e.g., small trees or shrubs) that might share similar structural profiles.

While RGB vegetation indices are good for separating green biomass, differentiating between various shades of brown, black, and white of non-vegetated surfaces like slash piles, rocks, and bare soil can be more challenging. Many common rock-forming minerals are “spectrally featureless” in the visible range, often appearing pale grey to white (Harris et al. 2010). As such, we will attempt to filter out very dark (black) or very light (white) boulders from the reddish-brown tones common of wood in slash piles using RGB indices that focus on overall brightness and color purity (Kior et al. 2024). The Brightness Index (BI) can help us exclude extremely dark (black; low BI) or bright (white; high BI) objects. Setting a minimum threshold for saturation (i.e. color purity) can remove achromatic features (like grey, black, or white objects that have very low saturation values close to 0) to help us isolate more chromatic (brown) slash piles.

This data fusion methodology leverages the complementary strengths of both data types, using 3D geometry for initial object segmentation and 2D spectral data to refine detections by excluding green biomass and potential rock and soil features that appear very dark or very light, leading to more accurate slash pile identification.

7.1 RGB Indices

For these formulas, \(R\), \(G\), and \(B\) represent the raw pixel values for the Red, Green, and Blue bands, respectively. For indices that use normalized values, \(r\), \(g\), and \(b\) are defined as:

\(r = \frac{R}{R+G+B}\) \(g = \frac{G}{R+G+B}\) \(b = \frac{B}{R+G+B}\)

Green Red Vegetation Index (GRVI)

\[GRVI = \frac{G - R}{G + R}\]

Red Green Ratio Index (RGRI)

\[RGRI = \frac{R}{G}\]

Visible Band-Difference Vegetation Index (VDVI)

\[VDVI = \frac{2G - R - B}{2G + R + B}\]

Red Green Blue Vegetation Index (RGBVI)

\[RGBVI = \frac{G^2 - (B \cdot R)}{G^2 + (B \cdot R)}\]

Excess Green (ExG)

\[ExG = 2g - r - b = \frac{2G - R - B}{R+G+B}\]

Excessive Red (ExR) \[ExR = 1.4r - g = \frac{1.4R - G}{R+G+B}\]

Excess Green-Excess Red (ExGR)

\[ExGR = 3g - 2.4r - b = \frac{3G - 2.4R - B}{R+G+B}\]

Brightness Index (BI)

\[BI = \sqrt{\frac{R^2+G^2+B^2}{3}}\]

Saturation

\[Saturation = \frac{\max(R,G,B) - \min(R,G,B)}{\max(R,G,B)}\]

7.1.1 Spectral Index Functions

Although there are dedicated R packages to calculate various spectral indicies (e.g. RStoolbox [Muller et al. 2025]), we’ll make our own to ensure data quality checks and to limit the spectral indices to those highlighted above.

# check raster bands and extract rgb layers
check_raster_bands <- function(rast, red_band_idx, green_band_idx, blue_band_idx) {
  # convert to SpatRaster if input is from 'raster' package
  if(
    inherits(rast, "RasterStack") 
    || inherits(rast, "RasterBrick")
  ){
    rast <- terra::rast(rast)
  }else if(!inherits(rast, "SpatRaster")){
    stop("Input 'rast' must be a SpatRaster from the `terra` package")
  }

  # check if band indices are valid
  num_bands <- terra::nlyr(rast)
  # let 999999 be a cheat code
  cheat_code <- 999999
  
  if(
    ( red_band_idx!=cheat_code && red_band_idx > num_bands )
    || ( green_band_idx!=cheat_code && green_band_idx > num_bands )
    || ( blue_band_idx!=cheat_code &&  blue_band_idx > num_bands )
    || ( red_band_idx!=cheat_code && red_band_idx < 1 )
    || ( green_band_idx!=cheat_code && green_band_idx < 1 )
    || ( blue_band_idx!=cheat_code &&  blue_band_idx < 1 )
    || length(unique(c(red_band_idx,green_band_idx,blue_band_idx)))!=3
  ){
    stop("Invalid band index provided. Band indices must correspond to existing, unique layers in the raster object.")
  }

  # extract bands
  if(red_band_idx!=cheat_code){ R <- rast[[red_band_idx]] }else{ R <- rast[[1]] }
  if(green_band_idx!=cheat_code){ G <- rast[[green_band_idx]] }else{ G <- rast[[1]] }
  if(blue_band_idx!=cheat_code){ B <- rast[[blue_band_idx]] }else{ B <- rast[[1]] }

  return(list(R = R, G = G, B = B))
}
# check_raster_bands(ortho_rast, red_band_idx=1, green_band_idx=3, blue_band_idx = 999999)

# calculate green red vegetation index (GRVI)
# (G - R) / (G + R)
spectral_index_grvi <- function(rast, red_band_idx, green_band_idx) {
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx = 999999) # blue_band_idx is dummy here
  R <- bands$R
  G <- bands$G

  grvi <- (G - R) / (G + R)
  names(grvi) <- "grvi"
  return(grvi)
}
# spectral_index_grvi(ortho_rast,red_band_idx = 1, green_band_idx = 2) %>% terra::plot()

# red green ratio index (RGRI)
# R/G
spectral_index_rgri <- function(rast, red_band_idx, green_band_idx) {
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx = 999999) # blue_band_idx is dummy here
  R <- bands$R
  G <- bands$G

  rgri <- R/G
  names(rgri) <- "rgri"
  return(rgri)
}
# spectral_index_grvi(ortho_rast,red_band_idx = 1, green_band_idx = 2) %>% terra::plot()

# calculate visible band-difference vegetation index (VDVI)
# (2G - R - B) / (2G + R + B)
spectral_index_vdvi <- function(rast, red_band_idx, green_band_idx, blue_band_idx) {
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx)
  R <- bands$R
  G <- bands$G
  B <- bands$B

  vdvi <- (2 * G - R - B) / (2 * G + R + B)
  names(vdvi) <- "vdvi"
  return(vdvi)
}
# spectral_index_vdvi(ortho_rast,red_band_idx = 1, green_band_idx = 2, blue_band_idx = 3) %>% terra::plot()

# calculate red green blue vegetation index (RGBVI)
# (G^2 - (B * R)) / (G^2 + (B * R))
spectral_index_rgbvi <- function(rast, red_band_idx, green_band_idx, blue_band_idx) {
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx)
  R <- bands$R
  G <- bands$G
  B <- bands$B

  rgbvi <- (G^2 - (B * R)) / (G^2 + (B * R))
  names(rgbvi) <- "rgbvi"
  return(rgbvi)
}
# spectral_index_rgbvi(ortho_rast,red_band_idx = 1, green_band_idx = 2, blue_band_idx = 3) %>% terra::plot()

# calculate excess green (ExG)
# (2G - R - B) / (R + G + B) (using normalized RGB values)
# 2G - R - B (using raw values, then normalized by sum of R+G+B)
spectral_index_exg <- function(rast, red_band_idx, green_band_idx, blue_band_idx) {
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx)
  R <- bands$R
  G <- bands$G
  B <- bands$B

  # Calculate normalized RGB values
  sum_rgb <- R + G + B
  r_norm <- R / sum_rgb
  g_norm <- G / sum_rgb
  b_norm <- B / sum_rgb

  exg <- (2 * g_norm - r_norm - b_norm)
  names(exg) <- "exg"
  return(exg)
}
# spectral_index_exg(ortho_rast,red_band_idx = 1, green_band_idx = 2, blue_band_idx = 3) %>% terra::plot()

# calculate brightness index (BI)
# sqrt((R^2 + G^2 + B^2) / 3)
spectral_index_bi <- function(rast, red_band_idx, green_band_idx, blue_band_idx) {
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx)
  R <- bands$R
  G <- bands$G
  B <- bands$B

  bi <- sqrt((R^2 + G^2 + B^2) / 3)
  # normalize to 0-1 range
  max_brightness = max(
    terra::minmax(R)[2]
    , terra::minmax(G)[2]
    , terra::minmax(B)[2]
    , na.rm = T
  )
  
  bi <- bi/max_brightness
  names(bi) <- "bi"
  return(bi)
}
# spectral_index_bi(ortho_rast,red_band_idx = 1, green_band_idx = 2, blue_band_idx = 3) %>% terra::plot()

# calculate excessive red (ExR)
# 1.4r - g, where r and g are normalized RGB values.
spectral_index_exr <- function(rast, red_band_idx, green_band_idx, blue_band_idx){
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx)
  R <- bands$R
  G <- bands$G
  B <- bands$B

  # Calculate normalized RGB values
  sum_rgb <- R + G + B
  r_norm <- R / sum_rgb
  g_norm <- G / sum_rgb

  exr <- (1.4 * r_norm - g_norm)
  names(exr) <- "exr"
  return(exr)
}

#' calculate excess green-excess red (ExGR)
#' 3g - 2.4r - b, where r, g, b are normalized RGB values.
#' equivalent to ExG - ExR.
spectral_index_exgr <- function(rast, red_band_idx, green_band_idx, blue_band_idx) {
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx)
  R <- bands$R
  G <- bands$G
  B <- bands$B

  # Calculate normalized RGB values
  sum_rgb <- R + G + B
  r_norm <- R / sum_rgb
  g_norm <- G / sum_rgb
  b_norm <- B / sum_rgb

  exgr <- (3 * g_norm - 2.4 * r_norm - b_norm)
  names(exgr) <- "exgr"
  return(exgr)
}

# calculate saturation (SAT)
# (max(R,G,B) - min(R,G,B)) / max(R,G,B)
spectral_index_saturation <- function(rast, red_band_idx, green_band_idx, blue_band_idx) {
  bands <- check_raster_bands(rast, red_band_idx, green_band_idx, blue_band_idx)
  R <- bands$R
  G <- bands$G
  B <- bands$B

  max_rgb <- max(R, G, B)
  min_rgb <- min(R, G, B)
  sat <- (max_rgb - min_rgb) / max_rgb
  names(sat) <- "sat"
  return(sat)
}
# spectral_index_saturation(ortho_rast,red_band_idx = 1, green_band_idx = 2, blue_band_idx = 3) %>% terra::plot()

# calculate all indices
calculate_all_rgb_indices <- function(raster_obj, red_band_idx, green_band_idx, blue_band_idx) {
  # call individual index functions
  grvi_layer <- spectral_index_grvi(raster_obj, red_band_idx, green_band_idx)
  rgri_layer <- spectral_index_rgri(raster_obj, red_band_idx, green_band_idx)
  vdvi_layer <- spectral_index_vdvi(raster_obj, red_band_idx, green_band_idx, blue_band_idx)
  rgbvi_layer <- spectral_index_rgbvi(raster_obj, red_band_idx, green_band_idx, blue_band_idx)
  exg_layer <- spectral_index_exg(raster_obj, red_band_idx, green_band_idx, blue_band_idx)
  exr_layer <- spectral_index_exr(raster_obj, red_band_idx, green_band_idx, blue_band_idx)
  exgr_layer <- spectral_index_exgr(raster_obj, red_band_idx, green_band_idx, blue_band_idx)
  bi_layer <- spectral_index_bi(raster_obj, red_band_idx, green_band_idx, blue_band_idx)
  sat_layer <- spectral_index_saturation(raster_obj, red_band_idx, green_band_idx, blue_band_idx)

  # stack all calculated indices into a single spatraster
  all_indices <- c(
    grvi_layer
    , rgri_layer
    , vdvi_layer
    , rgbvi_layer
    , exg_layer
    , exr_layer
    , exgr_layer
    , bi_layer
    , sat_layer
  )

  return(all_indices)
}

let’s test this calculate_all_rgb_indices() function with our orthomosaic

# calculate_all_rgb_indices
all_rgb_indices_rast <- calculate_all_rgb_indices(ortho_rast,red_band_idx = 1, green_band_idx = 2, blue_band_idx = 3)
# huh?
all_rgb_indices_rast %>% names()
## [1] "grvi"  "rgri"  "vdvi"  "rgbvi" "exg"   "exr"   "exgr"  "bi"    "sat"

let’s plot all of those indices for the entire extent of our orthomoasic

# plot
terra::plot(
  all_rgb_indices_rast
  , nc = 3
  # , nr = 3
  , mar = c(0.5,0.5,2,0.5)
  , axes = FALSE
  , legend = F
)

we can also look at the correlation between the different indices

# investigate correlation among covariates
all_rgb_indices_rast %>% 
  terra::pairs(
    maxcells = min(11111, terra::ncell(all_rgb_indices_rast)*.01)
  )

many of these spectral indices are highly (even perfectly) correlated, meaning they provide redundant information. to streamline our method and ensure we utilize unique spectral insights for distinguishing slash piles from the surrounding terrain and other objects, we will select only one index from each correlated group. GRVI, RGRI, and ExR are highly correlated, as are VDVI, RGBVI, and ExG. we’ll choose one from each of these sets.

7.1.2 Spectral Index of Polygons

we now need a function to crop the raster with all spectral indices given a polygon input data and return the spectral index values as columns attached to the polygon

extract_rast_values <- function(sf_data, rast, fun_agg = mean) {
  # checks
  if(!inherits(rast, "SpatRaster")){
    stop("Input `rast` must be a SpatRaster object.")
  }
  if(!inherits(sf_data, "sf")){
    stop("Input `sf_data` must be an sf data frame.")
  }
  if(!all(sf::st_geometry_type(sf_data) %in% c("POLYGON", "MULTIPOLYGON"))) {
    stop("Input `sf_data` must contain polygon geometries.")
  }
  if(!is.function(fun_agg)) {
    stop("Argument `fun_agg` must be a function (e.g., mean, median, sum).")
  }

  # crs
  sf_data <- sf_data %>% sf::st_transform(terra::crs(rast))

  # extract values for each layer within each polygon
  extracted_values <- terra::extract(
    x = rast
    , y = sf_data
    , fun = fun_agg
    , na.rm = TRUE
  )

  # clean data
  fun_name <- deparse(substitute(fun_agg))
  extracted_values <- extracted_values %>% 
    dplyr::select(-ID) %>% 
    dplyr::rename_with(
      ~ paste0(
        "rast_"
        # , fun_name ### if we want to have custom output depending on the fun_agg
        , "agg"
        , "_"
        , .x
        , recycle0 = TRUE
      )
    )

  # Merge the extracted values back to the original sf data frame
  # The row order is preserved by terra::extract, so a direct cbind is safe
  # if no rows were dropped due to spatial mismatch.
  # For robustness, we can explicitly join by row ID if needed, but for simple cases, cbind works.
  # Assuming sf_data has a unique ID column or row order is stable:
  sf_data_with_indices <- sf_data %>% dplyr::bind_cols(extracted_values)

  return(sf_data_with_indices)
}
# extract_rast_values(slash_piles_polys, rast = ortho_rast) %>% dplyr::glimpse()

7.2 Ground Truth Pile Spectral Summary

let’s calculate the various spectral indices on our ground truth slash pile polygons by getting the median value within the bounds of the pile

# extract_rast_values
rgb_indices_df <- extract_rast_values(slash_piles_polys, rast = all_rgb_indices_rast, fun_agg = median)
# quick summary
rgb_indices_df %>% 
  sf::st_drop_geometry() %>% 
  dplyr::select(tidyselect::starts_with("rast_agg_")) %>%
  summary()
##  rast_agg_grvi       rast_agg_rgri    rast_agg_vdvi       rast_agg_rgbvi    
##  Min.   :-0.044701   Min.   :0.7708   Min.   :-0.035258   Min.   :-0.06457  
##  1st Qu.:-0.019449   1st Qu.:0.9954   1st Qu.:-0.012092   1st Qu.:-0.02322  
##  Median :-0.006777   Median :1.0136   Median : 0.008204   Median : 0.01993  
##  Mean   :-0.002926   Mean   :1.0072   Mean   : 0.001518   Mean   : 0.00542  
##  3rd Qu.: 0.002293   3rd Qu.:1.0397   3rd Qu.: 0.014962   3rd Qu.: 0.03147  
##  Max.   : 0.129425   Max.   :1.0936   Max.   : 0.024391   Max.   : 0.04956  
##   rast_agg_exg       rast_agg_exr     rast_agg_exgr       rast_agg_bi     
##  Min.   :-0.04646   Min.   :0.02609   Min.   :-0.20124   Min.   :0.05384  
##  1st Qu.:-0.01606   1st Qu.:0.13401   1st Qu.:-0.15465   1st Qu.:0.32317  
##  Median : 0.01097   Median :0.14125   Median :-0.13261   Median :0.48469  
##  Mean   : 0.00212   Mean   :0.13701   Mean   :-0.13468   Mean   :0.41280  
##  3rd Qu.: 0.02005   3rd Qu.:0.15099   3rd Qu.:-0.11529   3rd Qu.:0.51978  
##  Max.   : 0.03279   Max.   :0.17662   Max.   :-0.03238   Max.   :0.60218  
##   rast_agg_sat    
##  Min.   :0.03628  
##  1st Qu.:0.06875  
##  Median :0.08491  
##  Mean   :0.10128  
##  3rd Qu.:0.11716  
##  Max.   :0.37621

let’s plot it

# pivot
agg_df_temp <- rgb_indices_df %>% 
  sf::st_drop_geometry() %>% 
  dplyr::select(tidyselect::starts_with("rast_agg_")) %>% 
  tidyr::pivot_longer(cols = dplyr::everything()) %>% 
  dplyr::mutate(name = stringr::str_remove_all(name,"rast_agg_") %>% stringr::str_to_upper())
# plot
ggplot2::ggplot() +
  ggplot2::geom_density(
    data = agg_df_temp
    , mapping = ggplot2::aes(x = value, fill = name)
    , color = NA, alpha = 0.8
  ) +
  ggplot2::geom_vline(
    data = agg_df_temp %>% dplyr::group_by(name) %>% dplyr::mutate(value=median(value,na.rm=T))
    , mapping = ggplot2::aes(xintercept = value, color = "median")
    , linetype = "solid", lwd = 1
  ) +
    ggplot2::geom_vline(
    data = agg_df_temp %>% dplyr::group_by(name) %>% dplyr::mutate(value=quantile(value,na.rm=T,probs=0.025))
    , mapping = ggplot2::aes(xintercept = value, color = "p95")
    , linetype = "solid", lwd = 1
  ) +
  ggplot2::geom_vline(
    data = agg_df_temp %>% dplyr::group_by(name) %>% dplyr::mutate(value=quantile(value,na.rm=T,probs=(1-0.025)))
    , mapping = ggplot2::aes(xintercept = value, color = "p95")
    , linetype = "solid", lwd = 1
  ) +
  ggplot2::scale_fill_viridis_d(option = "turbo", begin = 0.1, end = 0.9, alpha = 0.7) +
  # ggplot2::scale_fill_brewer(palette = "Set2") +
  ggplot2::scale_color_manual(values = c("gray22","gray","gray")) +
  ggplot2::scale_x_continuous(breaks = scales::breaks_extended(8)) +
  ggplot2::facet_wrap(
    facets = dplyr::vars(name)
    , ncol = 3
    , scales = "free"
  ) +
  ggplot2::labs(
    x = "", y = "", color = "", fill = ""
  ) +
  ggplot2::theme_light() +
  ggplot2::theme(
    legend.position = "top"
    , strip.text = ggplot2::element_text(size = 11, color = "black", face = "bold")
    , axis.text.x = ggplot2::element_text(size = 7)
    , axis.text.y = ggplot2::element_blank()
  ) + 
  ggplot2::guides(fill = "none")

that’s interesting. let’s see how those values compare with the thresholds identified in the research to identify green crops, trees, and shrubs using RGB spectral indicies

Index Likely Crop Likely Tree Likely Shrub Source
VDVI x > 0.06 x > 0.04 x > 0.03 Wang et al. 2025
RGRI x < 0.63 x < 0.70 x < 0.52 Wang et al. 2025
RGBVI x > -20.0 x > -22.0 x > -18.0 Wang et al. 2025
ExG x > 58.0 x > 59.5 x > 59.0 Wang et al. 2025
GRVI x > 0.0 x > 0.0 Motohka et al. 2010
ExGR x > 0.0 x > 0.0 x > 0.0 Riehle et al. 2020
  • VDVI: setting a threshold to remove candidate slash piles with VDVI>0.03 should work well
  • RGRI: setting a threshold to remove candidate slash piles with RGRI<0.70 should work well
  • RGBVI: setting a threshold to remove candidate slash piles with RGBVI>-18.0 should not work well
    • There appears to be a mis-match between the range of values we calculated ([-0.61, 1.00]) and their range ([-50, 78])
  • ExG: setting a threshold to remove candidate slash piles with ExG>59.5 should not work well
    • There appears to be a mis-match between the range of values we calculated ([-1.00, 1.91]) and their range ([0, 150])
  • GRVI: setting a threshold to remove candidate slash piles with GRVI>0.0 should work moderately well
    • We may need to raise this threshold to remove candidate slash piles with GRVI>0.05, for example
  • ExGR: setting a threshold to remove candidate slash piles with ExGR>0.0 should work well
  • BI: setting a threshold to remove candidate slash piles that are extremely dark (BI<0.10) or bright (BI>0.90) should work well
  • Saturation: setting a threshold to remove candidate slash piles that have low saturation values (SAT<0.05) should work well

Based on these thresholds identified in the literature and our objective to remove very highly correlated indices, we’ll proceed with the following five indices to refine the structurally detected candidate slash piles:

  1. RGRI
  2. VDVI
  3. ExGR
  4. BI
  5. Saturation (SAT)

7.2.1 Voting System

let’s consider a voting system approach for filtering candidate slash piles using the multiple spectral indices. a voting system could allow for a more robust and nuanced decision than relying on a single index.

let’s make a highly specialized function using the data returned by out extract_rast_values() function

rgb_indices_threshold_voting <- function(
  rgb_indices_df
  # define ranges to *keep* piles
  , th_rgri = c((0.7+0.01),Inf) # increase each by 0.01 since we'll be checking lower<=x<=upper
  , th_vdvi = c(-Inf,(0.03+0.01)) # increase each by 0.01 since we'll be checking lower<=x<=upper
  , th_exgr = c(-Inf,0)
  , th_bi = c(0.1,0.9)
  , th_sat = c(0.05,Inf)
){

  # checks
  if(!inherits(rgb_indices_df, "data.frame")){
      stop("Input `rgb_indices_df` must be an data.frame.")
  }
  # names
  agg_cols <- c("rast_agg_exgr","rast_agg_rgri","rast_agg_vdvi","rast_agg_bi","rast_agg_sat") # "rast_agg_rgbvi",
  nm_diff <- base::setdiff(
    agg_cols
    , names(rgb_indices_df)
  )
  if(length(nm_diff)>0){
    stop(paste0("required variables missing:\n", "... ", paste(nm_diff, collapse = ", ") ))
  }
  # thresholds
  if(length(th_exgr)!=2 || th_exgr[1]>th_exgr[2]){
    stop("Input `th_exgr` must be of length 2 as: c(lower,upper) defining the range of values to keep where lower<=upper")
  }
  if(length(th_rgri)!=2 || th_rgri[1]>th_rgri[2]){
    stop("Input `th_rgri` must be of length 2 as: c(lower,upper) defining the range of values to keep where lower<=upper")
  }
  if(length(th_vdvi)!=2 || th_vdvi[1]>th_vdvi[2]){
    stop("Input `th_vdvi` must be of length 2 as: c(lower,upper) defining the range of values to keep where lower<=upper")
  }
  if(length(th_bi)!=2 || th_bi[1]>th_bi[2]){
    stop("Input `th_bi` must be of length 2 as: c(lower,upper) defining the range of values to keep where lower<=upper")
  }
  if(length(th_sat)!=2 || th_sat[1]>th_sat[2]){
    stop("Input `th_sat` must be of length 2 as: c(lower,upper) defining the range of values to keep where lower<=upper")
  }
  
  # get rid of columns we'll create
  rgb_indices_df <- rgb_indices_df %>%
    # throw in hey_xxxxxxxxxx to test it works if we include non-existant columns
    dplyr::select( -dplyr::any_of(c(
      "hey_xxxxxxxxxx"
      , "inrange_th_exgr"
      , "inrange_th_rgri"
      , "inrange_th_vdvi"
      , "inrange_th_bi"
      , "inrange_th_sat"
    )))
  
  # check threshold
  ret_df <- rgb_indices_df %>% 
    # dplyr::select(dplyr::all_of(agg_cols)) %>% 
    dplyr::mutate(
      inrange_th_exgr = ifelse(
          !is.na(rast_agg_exgr)
          & rast_agg_exgr >= th_exgr[1]
          & rast_agg_exgr <= th_exgr[2]
          , 1, 0
        )
      , inrange_th_rgri = ifelse(
          !is.na(rast_agg_rgri)
          & rast_agg_rgri >= th_rgri[1]
          & rast_agg_rgri <= th_rgri[2]
          , 1, 0
        )
      , inrange_th_vdvi = ifelse(
          !is.na(rast_agg_vdvi)
          & rast_agg_vdvi >= th_vdvi[1]
          & rast_agg_vdvi <= th_vdvi[2]
          , 1, 0
        )
      , inrange_th_bi = ifelse(
          !is.na(rast_agg_bi)
          & rast_agg_bi >= th_bi[1]
          & rast_agg_bi <= th_bi[2]
          , 1, 0
        )
      , inrange_th_sat = ifelse(
          !is.na(rast_agg_sat)
          & rast_agg_sat >= th_sat[1]
          & rast_agg_sat <= th_sat[2]
          , 1, 0
        )
    ) %>% 
    dplyr::rowwise() %>%
    dplyr::mutate(
      inrange_th_votes = sum(
        dplyr::c_across(tidyselect::starts_with("inrange_th_"))
        , na.rm = T
      ) %>% 
      dplyr::coalesce(0)
    ) %>%
    ungroup()
  
  #return
  return(ret_df)
}

let’s look at the columns we get from our rgb_indices_threshold_voting() function

rgb_indices_df <- rgb_indices_threshold_voting(rgb_indices_df=rgb_indices_df)
# huh?
rgb_indices_df %>% 
  sf::st_drop_geometry() %>% 
  dplyr::select(tidyselect::starts_with("inrange_th_")) %>% 
  summary()
##  inrange_th_exgr inrange_th_rgri inrange_th_vdvi inrange_th_bi  
##  Min.   :1       Min.   :1       Min.   :1       Min.   :0.000  
##  1st Qu.:1       1st Qu.:1       1st Qu.:1       1st Qu.:1.000  
##  Median :1       Median :1       Median :1       Median :1.000  
##  Mean   :1       Mean   :1       Mean   :1       Mean   :0.929  
##  3rd Qu.:1       3rd Qu.:1       3rd Qu.:1       3rd Qu.:1.000  
##  Max.   :1       Max.   :1       Max.   :1       Max.   :1.000  
##  inrange_th_sat  inrange_th_votes
##  Min.   :0.000   Min.   :4.000   
##  1st Qu.:1.000   1st Qu.:5.000   
##  Median :1.000   Median :5.000   
##  Mean   :0.929   Mean   :4.858   
##  3rd Qu.:1.000   3rd Qu.:5.000   
##  Max.   :1.000   Max.   :5.000

notice the “Mean” value in the summary above is the proportion of ground truth piles that successfully met the spectral index threshold criteria (i.e. piles to be “kept”). it looks like the SAT threshold was most unaligned with these ground truth piles, something we anticipated by looking at the distributions above compared the threshold value recommended in the literature for detecting green vegetation.

let’s look at the proportional distribution of ground truth piles meeting the threshold by individual spectral index

rgb_indices_df %>% 
  sf::st_drop_geometry() %>% 
  dplyr::select(tidyselect::starts_with("inrange_th_")) %>% 
  tidyr::pivot_longer(cols = dplyr::everything()) %>% 
  dplyr::mutate(name = stringr::str_remove_all(name,"inrange_th_") %>% stringr::str_to_upper()) %>% 
  dplyr::filter(name!="VOTES") %>% 
  dplyr::count(name,value) %>% 
  dplyr::group_by(name) %>% 
  dplyr::mutate(
    pct = n/sum(n)
    , value = factor(value, levels = 0:1, labels = c("outside threshold","within threshold"), ordered = T)
    , lab = paste0(scales::percent(pct,accuracy=0.1), "\n(n=", scales::comma(n,accuracy=1), ")")
  ) %>% 
  ggplot2::ggplot(
    mapping = ggplot2::aes(x = value, y = pct, label = lab, fill = value)
  ) +
  ggplot2::geom_col(
    width = 0.6
    , color = NA, alpha = 0.8
  ) +
  ggplot2::geom_text(color = "black", size = 3, vjust = -0.2) +
  ggplot2::scale_fill_viridis_d(option = "magma", begin = 0.3, end = 0.7) +
  ggplot2::scale_y_continuous(
    breaks = seq(0,1,by=0.2)
    , labels = scales::percent
    , expand = ggplot2::expansion(mult = c(0,0.2))
  ) +
  ggplot2::facet_wrap(
    facets = dplyr::vars(name)
    , ncol = 2
  ) +
  ggplot2::labs(
    x = "", y = "", color = "", fill = ""
  ) +
  ggplot2::theme_light() +
  ggplot2::theme(
    legend.position = "none"
    , strip.text = ggplot2::element_text(size = 11, color = "black", face = "bold")
    , axis.text.x = ggplot2::element_text(size = 12)
    , axis.text.y = ggplot2::element_blank()
  )

and let’s look at the distribution of ground truth piles based on the number of individual spectral index thresholds met. we’ll use this count as our voting system.

dplyr::tibble(value=0:5) %>% 
  dplyr::left_join(
    rgb_indices_df %>% 
      sf::st_drop_geometry() %>% 
      dplyr::select(tidyselect::starts_with("inrange_th_votes")) %>% 
      tidyr::pivot_longer(cols = dplyr::everything()) %>% 
      dplyr::mutate(name = stringr::str_remove_all(name,"inrange_th_") %>% stringr::str_to_upper()) %>% 
      dplyr::count(name,value)
    , by = dplyr::join_by(value)
  ) %>% 
  dplyr::mutate(
    n = dplyr::coalesce(n,0)
    , pct = n/sum(n)
    , lab = paste0(scales::percent(pct,accuracy=0.1), "\n(n=", scales::comma(n,accuracy=1), ")")
    , value = factor(value)
  ) %>% 
  ggplot2::ggplot(
    mapping = ggplot2::aes(x = value, y = pct, label = lab, fill = value)
  ) +
  ggplot2::geom_col(
    width = 0.6
    , color = NA, alpha = 0.8
  ) +
  ggplot2::geom_text(color = "black", size = 4, vjust = -0.2) +
  ggplot2::scale_fill_viridis_d(option = "mako", direction=-1) +
  ggplot2::scale_y_continuous(
    labels = scales::percent
    , expand = ggplot2::expansion(mult = c(0,0.15))
  ) +
  ggplot2::labs(
    y = "", x = "spectral index threshold votes", color = "", fill = ""
    , subtitle = "distribution of ground truth piles meeting spectral index thresholds"
  ) +
  ggplot2::theme_light() +
  ggplot2::theme(
    legend.position = "none"
    , axis.text.x = ggplot2::element_text(size = 12)
    , axis.text.y = ggplot2::element_blank()
  )

we can use this spectral index voting system to filter candidate slash piles with a user-defined parameter which defines the sensitivity of filtering based on the spectral information. For example, a value of “5” would heavily weight the spectral information in determining which piles to keep while a value of “1” would put less weight on the spectral data.

let’s see what some of those piles with the fewest votes look like on the imagery

# filter for the plots with the fewest votes
dta_temp <- rgb_indices_df %>% 
  dplyr::filter(inrange_th_votes<=4) %>% 
  dplyr::slice_sample(n=12) %>% 
  sf::st_transform(terra::crs(ortho_rast))
# dta_temp %>% sf::st_drop_geometry() %>% dplyr::select(tidyselect::starts_with("rast_agg")) %>% summary()
# plot on ortho
plts_temp <-
  1:nrow(dta_temp) %>%
  purrr::map(function(x){
    dta <- dta_temp %>% dplyr::slice(x)
    dta <- dta %>% 
      dplyr::bind_cols(
        dta %>% 
          sf::st_drop_geometry() %>% 
          dplyr::select(tidyselect::starts_with("inrange_th_")) %>% 
          tidyr::pivot_longer(cols = dplyr::everything()) %>% 
          dplyr::mutate(name = stringr::str_remove_all(name,"inrange_th_") %>% stringr::str_to_upper()) %>% 
          dplyr::filter(name!="VOTES" & value == 0) %>% 
          dplyr::select(-value) %>% 
          tidyr::pivot_wider(values_from = name) %>% 
          dplyr::rowwise() %>% 
          dplyr::mutate(
            lab = paste(
                dplyr::c_across(dplyr::everything())
                , collapse = ", "
              )
          ) %>% 
          dplyr::select(lab)
      )
    ortho_plt_fn(dta) +
      ggplot2::geom_sf(data = dta, fill = NA, color = "white", lwd = 0.6) +
      ggplot2::labs(subtitle = paste0("Outside Threshold:\n",dta$lab)) + 
      ggplot2::theme(plot.subtitle = ggplot2::element_text(size = 9))  
  })
# combine
patchwork::wrap_plots(
  plts_temp
  , ncol = 4
)

all of those piles would have been voted out based on SAT or BI since they are either in shadows or the dead wood appears very bright/white (perhaps we need to adjust those thresholds?). let’s see a summary of those spectral index values for these piles with the fewest votes

dta_temp %>% sf::st_drop_geometry() %>% dplyr::select(rast_agg_bi, rast_agg_sat) %>% summary()
##   rast_agg_bi       rast_agg_sat    
##  Min.   :0.05384   Min.   :0.03866  
##  1st Qu.:0.07364   1st Qu.:0.04705  
##  Median :0.08453   Median :0.16250  
##  Mean   :0.23367   Mean   :0.16360  
##  3rd Qu.:0.52206   3rd Qu.:0.25809  
##  Max.   :0.57571   Max.   :0.37621

7.3 Candidate Polygon Spectral Filtering Function

let’s put all of this together to define a function that takes as input: 1) a spatial data frame of candidate polygons; 2) a raster with RGB spectral data; 3) user-defined spectral weighting (voting system)

polygon_spectral_filtering <- function(
  sf_data
  , rgb_rast
  # define the band index
  , red_band_idx
  , green_band_idx
  , blue_band_idx
  # spectral weighting
  , spectral_weight = 3
) {
  ### could make these parameters
  th_rgri <- c((0.7+0.01),Inf) # increase each by 0.01 since we'll be checking lower<=x<=upper
  th_vdvi <- c(-Inf,(0.03+0.01)) # increase each by 0.01 since we'll be checking lower<=x<=upper
  th_exgr = c(-Inf,0)
  th_bi <- c(0.1,0.9)
  th_sat <- c(0.05,Inf)
  
  # checks
  if(!inherits(rgb_rast, "SpatRaster")){
    stop("Input `rast` must be a SpatRaster object.")
  }
  if(!inherits(sf_data, "sf")){
    stop("Input `sf_data` must be an sf data frame.")
  }
  if(!all(sf::st_geometry_type(sf_data) %in% c("POLYGON", "MULTIPOLYGON"))) {
    stop("Input `sf_data` must contain polygon geometries.")
  }
  spectral_weight <- as.numeric(spectral_weight)
  if( !(spectral_weight %in% c(0:5)) ){
    stop("Input `spectral_weight` must be a number between 0 (no filtering based on spectral) and 5 (highest weighting of spectral data)")
  }
  # if you don't want to do it, then why do it?
  if(spectral_weight==0){
    return(sf_data)
  }
  ##################################################
  # calculate_all_rgb_indices
  ##################################################
  all_rgb_indices_rast <- calculate_all_rgb_indices(
    raster_obj = rgb_rast
    , red_band_idx = red_band_idx
    , green_band_idx = green_band_idx
    , blue_band_idx = blue_band_idx
  )
  ##################################################
  # extract_rast_values
  ##################################################
  rgb_indices_df <- extract_rast_values(
    sf_data = sf_data %>% dplyr::ungroup()
    , rast = all_rgb_indices_rast
    , fun_agg = median
  )
  ##################################################
  # rgb_indices_threshold_voting
  ##################################################
  rgb_indices_df <- rgb_indices_threshold_voting(
    rgb_indices_df=rgb_indices_df
    , th_rgri = th_rgri
    , th_vdvi = th_vdvi
    , th_exgr = th_exgr
    , th_bi = th_bi
    , th_sat = th_sat
  )
  ##################################################
  # filtering
  ##################################################
  rgb_indices_df <- rgb_indices_df %>% dplyr::filter(inrange_th_votes>=spectral_weight)
  # return
  return(rgb_indices_df)
}
# polygon_spectral_filtering(
#   sf_data = slash_piles_polys
#   , rgb_rast = ortho_rast
#   , red_band_idx = 1
#   , green_band_idx = 2
#   , blue_band_idx = 3
#   , spectral_weight = 4
# ) %>%
# nrow()
# # dplyr::glimpse()
# nrow(slash_piles_polys)