This function processes sound data from a specified directory, performs transfer learning using a pre-trained deep learning model, and saves the results.

deploy_CNN_binary(
  output_folder,
  output_folder_selections,
  output_folder_wav,
  top_model_path,
  path_to_files,
  windowlength = 512,
  detect_pattern = NA,
  architecture,
  clip_duration = 12,
  hop_size = 6,
  downsample_rate = 16000,
  threshold = 0.5,
  save_wav = TRUE,
  positive.class = "Gibbons",
  negative.class = "Noise",
  min_freq_khz = 0.4,
  max_freq_khz = 2
)

Arguments

output_folder

A character string specifying the path to the output folder where the results will be saved.

output_folder_selections

A character string specifying the path to the folder where selection tables will be saved.

output_folder_wav

A character string specifying the path to the folder where extracted WAV files will be saved.

top_model_path

A character string specifying the path to the pre-trained top model for classification.

path_to_files

A character string specifying the path to the directory containing sound files to process.

windowlength

Window length for input into the 'spectro' function from the seewave package. Defaults to 512.

detect_pattern

(optional) A character string specifying a pattern to detect in the file names. Default is NA.

architecture

A character string specifying the architecture of the pre-trained model.

clip_duration

The duration of each sound clip in seconds. Default is 12.

hop_size

The hop size, in seconds, between the start times of consecutive clips. Default is 6.

downsample_rate

The sample rate in Hz to which audio is downsampled. Set to 'NA' if no downsampling is required. Default is 16000.

threshold

The probability threshold above which a clip is classified as the positive class. Default is 0.5.

save_wav

A logical value indicating whether to save the extracted sound clips as WAV files. Default is TRUE.

positive.class

A character string specifying the positive class label. Default is 'Gibbons'.

negative.class

A character string specifying the negative class label. Default is 'Noise'.

min_freq_khz

The minimum frequency in kHz for spectrogram visualization. Default is 0.4.

max_freq_khz

The maximum frequency in kHz for spectrogram visualization. Default is 2.
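As an illustration of how `clip_duration` and `hop_size` interact (a sketch, not the package's internal code), the start times of the overlapping clips cut from a recording can be computed as:

```r
# Hypothetical example: start times (in seconds) of clips cut from a
# 60-second recording using the default clip_duration and hop_size.
clip_duration <- 12
hop_size <- 6
recording_length <- 60  # assumed example value, in seconds
start_times <- seq(from = 0, to = recording_length - clip_duration, by = hop_size)
start_times
#> [1]  0  6 12 18 24 30 36 42 48
```

Because `hop_size` is smaller than `clip_duration`, consecutive clips overlap, so a call falling on one clip boundary is still captured whole in a neighbouring clip.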

Value

Returns spectrogram images, WAV files (if specified), and Raven selection tables for each sound file in the specified directory.

Details

This function processes sound data from a directory, extracts sound clips, converts them to spectrogram images, classifies the images using a pre-trained deep learning model, and saves the results, including selection tables, images, and audio files.
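For intuition about the image step (illustrative arithmetic only, not package code), the default `downsample_rate` and `windowlength` together determine the frequency resolution of each spectrogram:

```r
# Assumed defaults from the usage above.
downsample_rate <- 16000  # Hz
windowlength <- 512       # FFT window length in samples
bin_hz <- downsample_rate / windowlength  # Hz spanned by each frequency bin
bin_hz
#> [1] 31.25
# Approximate number of bins covering the 0.4-2 kHz band of interest.
n_bins <- (2000 - 400) / bin_hz
n_bins
#> [1] 51.2
```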

Note

This function requires a model trained using the 'train_CNN_binary' function as input.

Examples

# Load data
data("TempBinWav")

# Create necessary directories
dir.create(file.path(tempdir(), "BinaryDir", "Wav"), recursive = TRUE, showWarnings = TRUE)

# Write to temp directory
writeWave(TempBinWav, filename = file.path(tempdir(), "BinaryDir", "Wav", "TempBinWav.wav"))

# Set model directory
trained_models_dir <- system.file("extdata", "trainedresnetbinary", package = "gibbonNetR")

# Specify model path
ModelPath <- list.files(trained_models_dir, full.names = TRUE)

# Run model
deploy_CNN_binary(
  clip_duration = 12,
  architecture = "alexnet",
  output_folder = file.path(tempdir(), "BinaryDir", "Results", "Images"),
  output_folder_selections = file.path(tempdir(), "BinaryDir", "Results", "Selections"),
  output_folder_wav = file.path(tempdir(), "BinaryDir", "Results", "Wavs"),
  detect_pattern = NA,
  top_model_path = ModelPath,
  path_to_files = file.path(tempdir(), "BinaryDir", "Wav"),
  downsample_rate = "NA",
  threshold = 0.5,
  save_wav = FALSE,
  positive.class = "Gibbons",
  negative.class = "Noise",
  max_freq_khz = 2
)
#> Warning: '/var/folders/1s/x8xb37tj45j86tn_stc4v44w0000gn/T//RtmpScZf01/BinaryDir/Wav' already exists
#> 1 out of 1
#> TempBinWav.wav
#> saving sound clips
#> Creating images 1 start time clips
#> Classifying images using Top Model
#> Warning: Some torch operators might not yet be implemented for the MPS device. A
#> temporary fix is to set the `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a
#> fall back for those operators:
#>  Add `PYTORCH_ENABLE_MPS_FALLBACK=1` to your `.Renviron` file, for example use
#>   `usethis::edit_r_environ()`.
#>  Using `Sys.setenv()` doesn't work because the env var must be set before R
#>   starts.
#> Saving output
#> Saving output
#> Saving Selection Table /var/folders/1s/x8xb37tj45j86tn_stc4v44w0000gn/T//RtmpScZf01/BinaryDir/Results/SelectionsGibbons-__TempBinWav.wavTopModelBinary.txt
#> 0.506283044815063