Transfer Learning from Sound Directories — deploy_CNN_multi

This function processes sound files from a specified directory, classifies them using a deep learning model trained via transfer learning, and saves the results.

deploy_CNN_multi(
  output_folder,
  output_folder_selections,
  output_folder_wav,
  top_model_path,
  path_to_files,
  windowlength = 512,
  detect_pattern = NA,
  architecture,
  clip_duration = 12,
  hop_size = 6,
  downsample_rate = 16000,
  threshold = 0.1,
  save_wav = TRUE,
  class_names = c("female.gibbon", "hornbill.helmeted", "hornbill.rhino", "long.argus",
    "noise"),
  noise_category = "noise",
  min_freq_khz = 0.4,
  max_freq_khz = 2,
  single_class = TRUE,
  single_class_category = "female.gibbon",
  for_prrec = TRUE
)

Arguments

output_folder: A character string specifying the path to the output folder where the results will be saved.
output_folder_selections: A character string specifying the path to the folder where selection tables will be saved.
output_folder_wav: A character string specifying the path to the folder where extracted WAV files will be saved.
top_model_path: A character string specifying the path to the pre-trained top model for classification.
path_to_files: A character string specifying the path to the directory or list containing sound files to process.
windowlength: The window length passed to the 'spectro' function from seewave. Defaults to 512.
detect_pattern: A pattern to match in the sound files, used to process only a subset of them. Defaults to NA.
architecture: A character string specifying the model architecture: 'alexnet', 'vgg16', 'vgg19', 'resnet18', 'resnet50', or 'resnet152'.
clip_duration: The duration of each sound clip in seconds.
hop_size: The hop size for splitting the sound clips.
downsample_rate: The downsample rate for audio in Hz, set to 'NA' if no downsampling is required.
threshold: The threshold for audio detection.
save_wav: A logical value indicating whether to save the extracted sound clips as WAV files.
class_names: A character vector containing the unique classes for training the model.
noise_category: A character string specifying the noise category for exclusion.
min_freq_khz: The minimum frequency in kHz for spectrogram visualization.
max_freq_khz: The maximum frequency in kHz for spectrogram visualization.
single_class: A logical value indicating whether to process only a single class. Currently, 'TRUE' is the only supported option.
single_class_category: A character string specifying the single class category when 'single_class' is set to TRUE.
for_prrec: A logical value indicating whether to output all detections so that a precision-recall (PR) curve can be created.
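
To illustrate how 'clip_duration' and 'hop_size' interact, here is a minimal sketch of sliding-window start times. It assumes that a clip starts every 'hop_size' seconds and that a trailing partial clip is dropped; the function's actual splitting logic may differ. The helper 'clip_start_times' is hypothetical and is not part of gibbonNetR.

```r
# Hypothetical helper (not part of gibbonNetR): start times of sliding clips.
# Assumption: clips start every hop_size seconds; partial clips are dropped.
clip_start_times <- function(total_duration, clip_duration = 12, hop_size = 6) {
  last_start <- total_duration - clip_duration
  if (last_start < 0) {
    return(numeric(0)) # recording is shorter than one clip
  }
  seq(from = 0, to = last_start, by = hop_size)
}

# A made-up 60-second recording with the default settings
starts <- clip_start_times(total_duration = 60)
clips <- data.frame(start = starts, end = starts + 12)
print(clips)
```

With the defaults (12-second clips, 6-second hop), consecutive clips in this sketch overlap by 6 seconds, so a call that straddles a clip boundary still falls fully inside a neighbouring clip.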

Value

Writes spectrogram images, WAV files (if 'save_wav' is TRUE), and a Raven selection table for each sound file to the specified output folders.
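
The selection tables can be read back into R for further analysis. The sketch below assumes they are tab-delimited '.txt' files, which is Raven's usual format; the file extension and the column names written by this function are not confirmed here. The helper 'read_selection_tables' and the demo table are hypothetical.

```r
# Hypothetical helper (not part of gibbonNetR): read every selection table
# in a folder into a named list of data frames.
# Assumption: tables are tab-delimited .txt files.
read_selection_tables <- function(selections_dir) {
  files <- list.files(selections_dir, pattern = "\\.txt$", full.names = TRUE)
  files <- files[file.size(files) > 0] # skip completely empty files
  tables <- lapply(files, read.delim, stringsAsFactors = FALSE, check.names = FALSE)
  names(tables) <- basename(files)
  tables
}

# Demo with a made-up table written to a temporary folder
demo_dir <- file.path(tempdir(), "SelectionDemo")
dir.create(demo_dir, showWarnings = FALSE)
demo <- data.frame(
  "Selection" = 1,
  "Begin Time (s)" = 12,
  "End Time (s)" = 24,
  check.names = FALSE
)
write.table(demo, file.path(demo_dir, "demo.txt"),
  sep = "\t", quote = FALSE, row.names = FALSE
)

tables <- read_selection_tables(demo_dir)
str(tables[["demo.txt"]])
```

In practice, 'selections_dir' would be the folder passed as 'output_folder_selections'.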

Details

This function processes sound data from a directory, extracts sound clips, converts them to images, performs image classification using a pre-trained deep learning model, and saves the results including selection tables and image and audio files.
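
As a rough illustration of the final step, the sketch below shows one way per-clip class probabilities could be turned into detections using 'threshold' and 'noise_category'. It assumes that the top-scoring class is taken for each clip, that noise clips are discarded, and that a clip is kept when its probability is at or above the threshold; the function's internal logic may differ. The helper 'filter_detections' and the probability values are made up for illustration.

```r
# Made-up class probabilities: one row per clip, one column per class.
probs <- matrix(
  c(
    0.80, 0.05, 0.15,
    0.05, 0.05, 0.90,
    0.45, 0.40, 0.15
  ),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("female.gibbon", "hornbill.rhino", "noise"))
)

# Hypothetical helper (not part of gibbonNetR).
# Assumption: keep the top class per clip, drop noise, apply the threshold.
filter_detections <- function(probs, noise_category = "noise", threshold = 0.1) {
  top_idx <- max.col(probs, ties.method = "first")
  top_prob <- probs[cbind(seq_len(nrow(probs)), top_idx)]
  out <- data.frame(
    clip = seq_len(nrow(probs)),
    class = colnames(probs)[top_idx],
    probability = top_prob,
    stringsAsFactors = FALSE
  )
  out[out$class != noise_category & out$probability >= threshold, , drop = FALSE]
}

detections <- filter_detections(probs, threshold = 0.1)
print(detections)
```

Raising 'threshold' in this sketch removes low-confidence clips, which trades recall for precision.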

Note

This function takes a model trained using the 'train_CNN_multi' function.

Examples

{
  # Load data
  data("TempBinWav")

  # Create necessary directories
  dir.create(file.path(tempdir(), "MultiDir", "Wav"), recursive = TRUE, showWarnings = FALSE)

  # Write to temp directory
  tuneR::writeWave(TempBinWav, filename = file.path(tempdir(), "MultiDir", "Wav", "TempBinWav.wav"))

  # Set model directory
  trained_models_dir <- system.file("extdata", "trainedresnetmulti", package = "gibbonNetR")

  # Specify model path
  ModelPath <- list.files(trained_models_dir, full.names = TRUE)

  # Deploy trained model over sound files
  deploy_CNN_multi(
    clip_duration = 12,
    architecture = "resnet18",
    output_folder = file.path(tempdir(), "MultiDir", "Results", "Images"),
    output_folder_selections = file.path(tempdir(), "MultiDir", "Results", "Selections"),
    output_folder_wav = file.path(tempdir(), "MultiDir", "Results", "Wavs"),
    detect_pattern = NA,
    top_model_path = ModelPath,
    path_to_files = file.path(tempdir(), "MultiDir", "Wav"),
    downsample_rate = "NA",
    save_wav = FALSE,
    class_names = c("female.gibbon", "hornbill.helmeted", "hornbill.rhino", "long.argus", "noise"),
    noise_category = "noise",
    single_class = FALSE,
    threshold = 0.25,
    max_freq_khz = 2
  )
}
#> Warning: '/var/folders/1s/x8xb37tj45j86tn_stc4v44w0000gn/T//RtmpScZf01/MultiDir/Results/Images' already exists
#> Warning: '/var/folders/1s/x8xb37tj45j86tn_stc4v44w0000gn/T//RtmpScZf01/MultiDir/Results/Selections' already exists
#> Warning: '/var/folders/1s/x8xb37tj45j86tn_stc4v44w0000gn/T//RtmpScZf01/MultiDir/Results/Wavs' already exists
#> 1 out of 1
#> TempBinWav.wav
#> saving sound clips
#> Creating images 1 start time clips
#> Classifying images using Top Model
#> Warning: Some torch operators might not yet be implemented for the MPS device. A
#> temporary fix is to set the `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a
#> fall back for those operators:
#> ℹ Add `PYTORCH_ENABLE_MPS_FALLBACK=1` to your `.Renviron` file, for example use
#>   `usethis::edit_r_environ()`.
#> ✖ Using `Sys.setenv()` doesn't work because the env var must be set before R
#>   starts.
#> 2222
#> Saving output
#> Saving output to /var/folders/1s/x8xb37tj45j86tn_stc4v44w0000gn/T//RtmpScZf01/MultiDir/Results/Images/___TopModel_.jpg
#> 
#> NANANANANANANANANANA
#> Saving Selection Table No Detections 
#> 0.455532073974609