\documentclass{article}

\usepackage[margin = 1.2in]{geometry}
\usepackage{float}
\usepackage[colorlinks = true, urlcolor = blue]{hyperref} % Must be loaded as the last package
\hypersetup{colorlinks}

%\VignetteEngine{knitr::knitr}
%\VignetteIndexEntry{monitoR Quick Start}

\setlength{\parindent}{0pt}
\setlength{\parskip}{5pt plus 1pt minus 1pt}

\title{A short introduction to acoustic template matching with \texttt{monitoR}}
\author{Sasha D. Hafner \\
  \small{Aarhus University, Aarhus, Denmark (\texttt{sasha.hafner@eng.au.dk})}
  \vspace{3mm}\\
  and
  \vspace{3mm}\\
  Jonathan Katz \\
  \small{Vermont Cooperative Fish and Wildlife Research Unit, University of Vermont, USA (\texttt{jonkatz4@gmail.com})}
}

\begin{document}
\maketitle

<>=
options(useFancyQuotes = FALSE)
library(knitr)
opts_chunk$set(comment = "#", out.height = "4.0in", out.width = "4.0in", fig.height = 5.2, fig.width = 5.2, fig.align = "center", cache = FALSE)
options(tidy = TRUE, width = 65)
@

<>=
library(tuneR)
library(monitoR)
#sf <- list.files('/home/sasha/Dropbox/UVMacoustic/monitoR/R', full.names = TRUE)
#sf <- sf[!grepl("eventEval.R", sf)]
#for(ff in sf) source(ff)
#df <- list.files('/home/sasha/Dropbox/UVMacoustic/monitoR/data', full.names = TRUE)
#for(ff in df) load(ff)
@

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Disclaimer}
``Although this software program has been used by the U.S. Geological Survey (USGS), no warranty, expressed or implied, is made by the USGS or the U.S. Government as to the accuracy and functioning of the program and related program material nor shall the fact of distribution constitute any such warranty, and no responsibility is assumed by the USGS in connection therewith.''

\section{Getting started}
\label{sec:example1}
The motivation behind the development of the \texttt{monitoR} package was automated detection and identification of animal vocalizations.
Here, we describe how to use \texttt{monitoR} functions to detect and identify songs of two bird species in a survey recording.
We'll start by loading the package.

<>=
library(monitoR)
@

The \texttt{monitoR} package includes a 23-second recording of songbirds as a \texttt{Wave} object (defined in the \texttt{tuneR} package) named \texttt{survey}.

<<>>=
data(survey)
survey
@

A spectrogram of the recording can be generated using the \texttt{viewSpec} function\footnote{
By default, this function just transforms the data and generates a spectrogram.
Try setting the \texttt{interactive} argument to \texttt{TRUE} for more viewing options that are useful for longer recordings.
}.

<>=
viewSpec(survey)
@

We could also play the recording using the \texttt{play} function from the \texttt{tuneR} package\footnote{
This requires that a command-line wav player is installed.
Here, we are using SoX.
}.

<>=
setWavPlayer("play")
play(survey)
@

You might see two different types of songs here.
The songs around 2, 10, and maybe 22 seconds are from a black-throated green warbler (\textit{Setophaga virens}), and there are ovenbird (\textit{Seiurus aurocapilla}) songs around 6, 9, 15, and 23 seconds.
In the following, we show how \texttt{monitoR} can be used to detect and identify these songs.

The \texttt{monitoR} package also includes short recordings of songs from the same two species as in \texttt{survey}: \texttt{btnw}, which contains a single song from a black-throated green warbler, and \texttt{oven}, which contains a single song from an ovenbird.
As with \texttt{survey}, these objects are \texttt{Wave} objects.

<<>>=
data(btnw)
data(oven)
btnw
oven
@

<<>>=
viewSpec(btnw)
@

<<>>=
viewSpec(oven)
@

As \texttt{Wave} objects, all of these recordings can be used directly in the template matching functions\footnote{
But you must set the \texttt{write.wav} argument to \texttt{TRUE} to do this.
}.
Users will typically work with wav files, however, rather than mp3 files or \texttt{Wave} objects.
To be consistent with this typical approach, we will save the two clips and the survey as wav files\footnote{
Here, we will use a temporary directory, but for your own recordings a more permanent location is better.
}.
We'll include a made-up date in the name of the survey file, since that is one way to determine ``absolute'' times for detections\footnote{
The other option is to use the modification date of the survey file, which clearly would be wrong here, since we just created it.
}.

<<>>=
btnw.fp <- file.path(tempdir(), "btnw.wav")
oven.fp <- file.path(tempdir(), "oven.wav")
survey.fp <- file.path(tempdir(), "survey2010-12-31_120000_EST.wav")
writeWave(btnw, btnw.fp)
writeWave(oven, oven.fp)
writeWave(survey, survey.fp)
@

\section{Spectrogram cross correlation}
\label{sec:corMatch}
The \texttt{monitoR} package provides functions for two kinds of template detection--spectrogram cross correlation and binary point matching.
We'll start with the correlation approach, which is slightly simpler.
The easiest way to create a correlation template is to call \texttt{makeCorTemplate} with only the \texttt{clip} argument specified.
This function carries out a Fourier transform and, by default, saves amplitude (volume) data for each cell within the resulting frequency-by-time matrix.

<<>>=
wct1 <- makeCorTemplate(btnw.fp)
@

The \texttt{makeCorTemplate} function displays a spectrogram of the recording\footnote{
This plot is more important when templates are produced interactively, as described below.
}, and shows the cells included in the template (all the cells here) in transparent purple\footnote{
This color can be selected with the argument \texttt{sel.col}.
}.
Let's look at a summary of the template we just created\footnote{
Entering the name of the template list calls the appropriate \texttt{show} method, which displays a simple summary.
}.

<<>>=
wct1
@

The \texttt{wct1} object is a template list, of class \texttt{corTemplateList}.
Here the idea of a list doesn't make a lot of sense, since we only have one template, but in \texttt{monitoR}, templates only exist as part of template lists.
Below, we'll create template lists with multiple templates.
The name associated with our single template in the list is \texttt{A}, which is the default, and is not very descriptive.
The name of a template can be set when it is created, using the \texttt{name} argument\footnote{
The \texttt{templateNames} function can be used to change template names in an existing template list.
For more detailed information, the \texttt{templateComment} function can be used to check and add comments to templates.
}\footnote{
In the interest of keeping this file from being too big, the figures produced by the calls in most of the following examples will not be shown.
For a complete version of this vignette, visit the \href{http://www.uvm.edu/rsenr/vtcfwru/R/?Page=monitoR/monitoR.htm}{monitoR website}.
}.

<>=
wct1 <- makeCorTemplate(btnw.fp, name = "w1")
wct1
@

This template is functional, but it does not take advantage of all the options available in \texttt{makeCorTemplate}.
Setting time and frequency limits is one of the simplest, and may be necessary when making a template from a long recording.
We'll make a new template that uses the \texttt{t.lim} and \texttt{frq.lim} arguments to focus on a particular section within the song.

<<>>=
wct2 <- makeCorTemplate(btnw.fp, t.lim = c(1.5, 2.1), frq.lim = c(4.2, 5.6), name = "w2")
wct2
@

Let's make two templates with the ovenbird recording as well.

<>=
oct1 <- makeCorTemplate(oven.fp, t.lim = c(1, 4), frq.lim = c(1, 11), name = "o1")
@

For the next template, we'll use the \texttt{dens} argument to reduce the size of the template--that is, the number of points included.

<>=
oct2 <- makeCorTemplate(oven.fp, t.lim = c(1, 4), frq.lim = c(1, 11), dens = 0.1, name = "o2")
@

Here, our template will include \textit{approximately} one-tenth of all the spectrogram points.
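
To make the roles of \texttt{t.lim}, \texttt{frq.lim}, and \texttt{dens} more concrete, the following base R sketch selects template cells from a made-up amplitude matrix.
It is only an illustration under stated assumptions: the matrix \texttt{amp}, the bin vectors, and the regular thinning rule are hypothetical, and are not taken from the \texttt{makeCorTemplate} source.

<<>>=
# Hypothetical spectrogram: rows are frequency bins (kHz), columns are time bins (s)
set.seed(1)
frq.bins <- seq(0.25, 11.75, by = 0.5)
t.bins <- seq(0.05, 4.95, by = 0.1)
amp <- matrix(rnorm(length(frq.bins) * length(t.bins), mean = -60, sd = 10),
  nrow = length(frq.bins), ncol = length(t.bins))

# Keep only the cells that fall inside the time and frequency limits
t.lim <- c(1, 4)
frq.lim <- c(1, 11)
cells <- expand.grid(
  frq = which(frq.bins >= frq.lim[1] & frq.bins <= frq.lim[2]),
  t = which(t.bins >= t.lim[1] & t.bins <= t.lim[2]))

# Thin the selection to roughly a fraction dens of the cells (assumed rule)
dens <- 0.1
keep <- round(seq(1, nrow(cells), length.out = round(dens * nrow(cells))))
pts <- cells[keep, ]
pts$amp <- amp[cbind(pts$frq, pts$t)]

# Fraction of the selected cells that ended up in the thinned template
nrow(pts) / nrow(cells)
@

A sparser template has fewer points to compare at every time bin of a survey, which is the motivation for reducing template size.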
Template lists with just a single template can be used for detection.
However, \texttt{monitoR} is designed to use template lists that contain multiple templates.
Templates can be combined in a single list with \texttt{combineCorTemplates}.

<<>>=
ctemps <- combineCorTemplates(wct1, wct2, oct1, oct2)
ctemps
@

The output from \texttt{combineCorTemplates} is a template list, just like the output from \texttt{makeCorTemplate}.
This function can be used to combine templates present in any number of correlation template lists (and each of the lists can contain any number of templates).
We can view all the templates in \texttt{ctemps} with the \texttt{plot} function.

<>=
plot(ctemps)
@

%<>=
%plot(ctemps, ask = FALSE)
%@

We can now use our templates to look for black-throated green warbler and ovenbird songs within the \texttt{survey} recording.
This is a three-step process: we first calculate correlation scores between the templates and each time bin in the survey, then identify ``peaks'' in the scores, and finally determine which, if any, score peaks exceed the score cutoff.
Correlation scores are calculated with \texttt{corMatch}.

<<>>=
cscores <- corMatch(survey.fp, ctemps)
@

This particular example should run in a few seconds, but \texttt{corMatch} can take much longer for long surveys\footnote{
Set the \texttt{parallel} argument to \texttt{TRUE} to speed this step up if you have a multi-core processor and are not using Windows.
}.
We can get a summary of the \texttt{cscores} object by entering its name, but it requires additional processing in order to be used for detecting songs.

<<>>=
cscores
@

The \texttt{corMatch} function returns \texttt{templateScores} objects, which contain the correlation scores for each template, but also the complete original templates, along with other information, like the name of the survey file used.
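
The idea behind the scores can be illustrated with a small base R sketch that slides a template across a survey spectrogram and calculates a correlation at each time bin.
This is a simplified illustration, assuming that the score is Pearson's correlation coefficient; the matrices \texttt{tmpl} and \texttt{surv} are made up, and the helper \texttt{slideCor} is hypothetical and is not part of \texttt{monitoR}.

<<>>=
set.seed(42)
n.frq <- 20
n.t <- 8

# Made-up template and survey amplitudes (frequency bins by time bins)
tmpl <- matrix(rnorm(n.frq * n.t), nrow = n.frq)
surv <- matrix(rnorm(n.frq * 100), nrow = n.frq)

# Embed a noisy copy of the template starting at time bin 41
surv[, 41:48] <- tmpl + rnorm(n.frq * n.t, sd = 0.3)

# Correlation between the template and each template-sized window of the survey
slideCor <- function(surv, tmpl) {
  n.t <- ncol(tmpl)
  starts <- seq_len(ncol(surv) - n.t + 1)
  sapply(starts, function(i) {
    cor(as.vector(surv[, i:(i + n.t - 1)]), as.vector(tmpl))
  })
}

scores <- slideCor(surv, tmpl)

# Time bin with the highest score, and the score itself
which.max(scores)
max(scores)
@

The result is one score per time bin, which is the kind of series that is searched for peaks in the next step.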
To make detections, we need to look within the correlation scores returned by \texttt{corMatch} for peaks, and then identify those peaks with values above the score cutoff as detections.
These two steps are carried out with \texttt{findPeaks}.

<<>>=
cdetects <- findPeaks(cscores)
@

The \texttt{findPeaks} function returns \texttt{detectionList} objects, which contain all the information originally present in \texttt{templateScores} objects, plus detection results.
To see a summary of the detections, just enter the name of the object.

<<>>=
cdetects
@

From the \texttt{n.peaks} column, we can see that there are between five and 38 peaks per template, and from the \texttt{n.detections} column, we can see that each template resulted in between one and three detections.
The detections can be extracted with the \texttt{getDetections} function.

<<>>=
getDetections(cdetects)
@

Whether or not a peak qualifies as a detection depends on a parameter called the score cutoff, which can be set separately for each template.
If we look at our template list again, we can see the score cutoffs given in the last column of the summary.

<<>>=
ctemps
@

They are currently set to the default value, but there is no reason to think that this value should work for all or even any templates.
Instead, score cutoffs need to be determined based on template performance, typically with a subset of the survey or surveys that will ultimately be searched for matching sounds.
Here we just have one, very short, survey, so we'll use the entire survey to determine what the score cutoff should be set to for each template.
We can see a graphical summary of the results with the generic \texttt{plot} function.

<>=
plot(cdetects)
@

Starting with the warbler templates, it is easy to see that template \texttt{w2} is much better than template \texttt{w1}--its scores are high for the three apparent songs, and relatively low elsewhere, including during ovenbird songs.
Changing the score cutoff cannot improve template \texttt{w1} much.
For template \texttt{w2}, the default value works well enough for this survey.
But to incorporate a safety factor, we might want to lower it, perhaps to 0.30.
Looking at the ovenbird results, both templates perform similarly, but if we want to detect songs like the faint one around 15 seconds, the score cutoff needs to be reduced, perhaps to 0.20.

We could recreate this template list by repeating the steps above, setting the \texttt{score.cutoff} argument to 0.30 for \texttt{w2} and to 0.20 for \texttt{o1} and \texttt{o2}.
Fortunately, using \texttt{templateCutoff} is a more efficient option.
It can be used as an extractor or replacement function.

<<>>=
templateCutoff(ctemps)
@

To change the cutoff for templates \texttt{w2}, \texttt{o1}, and \texttt{o2} in the template list \texttt{ctemps}, we can use:

<<>>=
templateCutoff(ctemps)[2:4] <- c(0.3, 0.2, 0.2)
@

Or, this call does the same thing (and is perhaps less prone to mistakes):

<<>>=
templateCutoff(ctemps) <- c(w2 = 0.3, o1 = 0.2, o2 = 0.2)
ctemps
@

Or, we could even use the special name \texttt{default} to specify a default value.

<<>>=
templateCutoff(ctemps) <- c(w2 = 0.3, default = 0.2)
ctemps
@

And now we could call \texttt{corMatch} and \texttt{findPeaks} again.
This approach would work fine, but it takes more time and effort than is needed: the scores are independent of the cutoffs, so it isn't necessary to recalculate them.
Instead, we can just change the score cutoffs in our existing \texttt{detectionList} object \texttt{cdetects}, with the same function \texttt{templateCutoff}.
The detections are automatically updated when score cutoffs are changed in a \texttt{detectionList} object.

<<>>=
templateCutoff(cdetects) <- c(w2 = 0.3, default = 0.2)
@

So it isn't even necessary to change \texttt{ctemps}\footnote{
Of course, you may want to reuse or save the template list with optimized cutoffs.
The \texttt{getTemplates} function can be helpful here--it can be used to extract the templates from a \texttt{detectionList} object.
}.
Now let's take a look at \texttt{cdetects}.

<<>>=
cdetects
@

Notice how the number of detections has changed for \texttt{o1} and \texttt{o2}\footnote{
The plot produced by the call below is omitted, but it would now show a different number of detections.
}.

<>=
plot(cdetects)
@

Since template \texttt{w1} is nearly useless, and templates \texttt{o1} and \texttt{o2} are nearly identical, we might want to drop \texttt{w1} and either \texttt{o1} or \texttt{o2} from our results.
This can be done using indexing\footnote{
Here, the omitted plot would only show results for \texttt{w2} and \texttt{o2}.
}.

<<>>=
cdetects <- cdetects[c("w2", "o2")]
cdetects
@

<>=
plot(cdetects)
@

In this short example, it is easy to verify the results using the plot above.
For longer surveys, for which these methods make more sense, the \texttt{showPeaks} function is more efficient.
It can be used to view all detections or even all peaks, one by one, and by setting the \texttt{verify} argument to \texttt{TRUE}, it can be used to save results of the verification process.

This simple example doesn't show all the available options for creating templates.
In particular, with the \texttt{select} argument set to \texttt{"rectangle"} or \texttt{"cell"}, it is possible to select areas or even individual cells to be included in a template.
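
The separation between scores, peaks, and cutoffs used throughout this section can be sketched in a few lines of base R.
This is a minimal illustration with a made-up score series; the helpers \texttt{localPeaks} and \texttt{detect}, and the simple neighbour-based definition of a peak, are assumptions for the example and are not the \texttt{findPeaks} implementation.

<<>>=
# Made-up score series, one element per time bin
scores <- c(0.05, 0.10, 0.45, 0.12, 0.08, 0.25, 0.07, 0.02, 0.33, 0.11)

# A peak is a score larger than both of its neighbours (assumed definition)
localPeaks <- function(x) {
  i <- 2:(length(x) - 1)
  i[x[i] > x[i - 1] & x[i] > x[i + 1]]
}
pks <- localPeaks(scores)
pks

# Detections are the peaks with scores above the cutoff
detect <- function(x, pks, cutoff) pks[x[pks] > cutoff]

# Changing the cutoff reuses the same scores and peaks
detect(scores, pks, cutoff = 0.4)
detect(scores, pks, cutoff = 0.2)
@

Because only the last step depends on the cutoff, lowering or raising it changes the set of detections without any new score calculation.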
To summarize, for template matching using spectrogram cross correlation one would typically use the following functions in the order given:
\begin{enumerate}
  \item \texttt{makeCorTemplate} to make the template(s)
  \item \texttt{combineCorTemplates} to combine templates together in a single template list
  \item \texttt{corMatch} to calculate scores
  \item \texttt{findPeaks} to find peaks and identify detections
  \item \texttt{plot} to see the scores and detections
  \item \texttt{getDetections} to get the (numeric) detection results
  \item \texttt{templateCutoff} to change template cutoffs in the detection list object (iteratively with \texttt{plot} and \texttt{getDetections})
\end{enumerate}

\section{Binary point matching}
\label{sec:binMatch}
We can use binary point matching to carry out a procedure similar to the one described above in Section~\ref{sec:corMatch}.
Because the type of data needed to calculate scores differs, binary templates differ from correlation templates, and the corresponding functions used for creating templates also differ in some ways.
The \texttt{makeBinTemplate} function converts the time-domain data into a binary format by default--i.e., cells are identified as either ``high'' or ``low'' (we'll refer to these as ``on'' and ``off'') depending on whether they are greater than or less than the user-set \texttt{amp.cutoff}, respectively.
The value of \texttt{amp.cutoff} is set interactively by default\footnote{
This can be changed with the \texttt{amp.cutoff} argument.
}.
With the default option, the spectrogram is updated each time the value of \texttt{amp.cutoff} changes.
Here, we'll set the cutoff directly in the function call, based on earlier trials.

<<>>=
wbt1 <- makeBinTemplate(btnw.fp, amp.cutoff = -40, name = "w1")
@

Time and frequency limits can be set as with \texttt{makeCorTemplate}.
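
The effect of \texttt{amp.cutoff} can be illustrated with a small base R sketch that splits a made-up amplitude matrix into ``on'' and ``off'' cells.
The matrix \texttt{amp} is hypothetical, and the score in the last line is an assumption made for illustration (the difference between the mean amplitudes of the ``on'' and ``off'' cells); it should not be read as the \texttt{binMatch} implementation.

<<>>=
# Made-up amplitude matrix (dB): 3 frequency bins by 4 time bins
amp <- matrix(c(-55, -35, -60,
                -30, -25, -58,
                -52, -28, -61,
                -57, -50, -59), nrow = 3)
amp.cutoff <- -40

# Cells above the cutoff are "on", all others are "off"
on <- amp > amp.cutoff
sum(on)
sum(!on)

# Assumed score: mean amplitude of "on" cells minus mean amplitude of "off" cells,
# evaluated here on the same matrix that was used to define the cells
mean(amp[on]) - mean(amp[!on])
@

Under this assumption, scores are amplitude differences rather than correlations, so sensible score cutoffs are on a different scale from those used with correlation templates.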
In the next template, we'll set these limits and also include a buffer around the ``on'' cells--cells in the buffer are excluded from the template\footnote{
The resulting plot is not shown here, again to keep the file size down.
}.

<>=
wbt2 <- makeBinTemplate(btnw.fp, amp.cutoff = -30, t.lim = c(1.5, 2.1), frq.lim = c(4.2, 5.6), buffer = 2, name = "w2")
@

The \texttt{amp.cutoff} and \texttt{buffer} values used above are based on trial and error not shown here.
Let's also make two ovenbird templates, with and without a buffer.

<>=
obt1 <- makeBinTemplate(oven.fp, amp.cutoff = -20, t.lim = c(1, 4), frq.lim = c(1, 11), name = "o1")
obt2 <- makeBinTemplate(oven.fp, amp.cutoff = -17, t.lim = c(1, 4), frq.lim = c(1, 11), buffer = 2, name = "o2")
@

Binary templates are combined with \texttt{combineBinTemplates}.

<<>>=
btemps <- combineBinTemplates(wbt1, wbt2, obt1, obt2)
btemps
@

Notice how binary templates have ``on'' points and ``off'' points, while correlation templates only have one type of point.

Scores are calculated using \texttt{binMatch}, which is analogous to \texttt{corMatch}.
As with \texttt{corMatch}, the output from this function is a \texttt{templateScores} object.

<<>>=
bscores <- binMatch(survey.fp, btemps)
@

From this point on, objects created by binary point matching are handled in the same way as those made with spectrogram cross correlation.
To find detections, use the same \texttt{findPeaks} function that we used above.

<<>>=
bdetects <- findPeaks(bscores)
@

And, as above, we can use the \texttt{plot} method for \texttt{detectionList} objects to view detections.

<>=
plot(bdetects)
@

Based on these results, it looks like all templates except \texttt{w1} could be useful with the right score cutoff.
We might want to use a cutoff of about 7 for \texttt{w2}, and perhaps 4 for \texttt{o1} and \texttt{o2} if the song around 15 seconds should be detected.
First, let's drop \texttt{w1}\footnote{
The plot that is omitted below would no longer show any results for \texttt{w1}.
}.
<<>>=
bdetects <- bdetects[-1]
@

<<>>=
templateCutoff(bdetects) <- c(w2 = 7, default = 4)
@

<>=
plot(bdetects)
@

Finally, the detections are:

<<>>=
getDetections(bdetects)
@

As with correlation templates, there is much more flexibility in making binary templates than this example suggests.
In particular, options for the \texttt{select} argument should be explored.
The most flexible is \texttt{"cell"}, but it is certainly not the quickest, and because individual cells are selected manually, the resulting templates may not be readily repeatable.

To summarize, for template matching using binary point matching one would typically use the following functions in the order given:
\begin{enumerate}
  \item \texttt{makeBinTemplate} to make the template(s)
  \item \texttt{combineBinTemplates} to combine templates together in a single template list
  \item \texttt{binMatch} to calculate scores
  \item \texttt{findPeaks} to find peaks and identify detections
  \item \texttt{plot} to see the scores and detections
  \item \texttt{getDetections} to get the (numeric) detection results
  \item \texttt{templateCutoff} to change template cutoffs in the detection list object (iteratively with \texttt{plot} and \texttt{getDetections})
\end{enumerate}

\section{Other functions}
\label{sec:otherfunctions}
The \texttt{monitoR} package includes several functions not described in this short introduction.
\begin{itemize}
  \item For manipulating recordings directly, there are the \texttt{cutWave}\footnote{Originally named \texttt{cutw}, but recently changed.}, \texttt{changeSampRate}, and \texttt{mp3Subsamp} functions.
  \item With the \texttt{plot} methods for template list and detection list objects (used in Sections~\ref{sec:corMatch} and \ref{sec:binMatch} above), color palettes and other characteristics of the plots can be modified.
  \item The \texttt{viewSpec} function is capable of much more than the above examples suggest.
  It can be used interactively to view different parts of a spectrogram, or even annotate it.
  \item Comments can be added to templates within a template list using the \texttt{templateComment} function.
  \item Binary and correlation templates were designed to be portable, and can be saved to text files using \texttt{writeCorTemplates} or \texttt{writeBinTemplates}.
  Existing template files can be read in with \texttt{readCorTemplates} or \texttt{readBinTemplates}.
  \item For verification of detections, there is the \texttt{showPeaks} function, which shows (or plays) individual detections (or peaks), and, in the interactive mode, allows the user to add verification data to a detection list object.
  \item The functions \texttt{collapseClips} and \texttt{bindEvents} can be used to extract and bind together detections from a longer recording, and could also be useful for verification.
  \item The function \texttt{eventEval} provides a way to compare results from template detection to a manually annotated spectrogram, and the function \texttt{timeAlign} removes redundant detections from a survey (e.g., those from multiple templates).
  \item The \texttt{compareTemplates} function can be used to compare the performance of multiple templates and evaluate similarity between templates.
  \item An alternative to writing \texttt{templateLists} and \texttt{detectionLists} locally is to store templates, survey metadata, and detection results in a MySQL database.
  The \texttt{monitoR} package includes a variety of MySQL queries that use the \texttt{RODBC} package to push data to and pull data from an acoustics database: uploads are accomplished with \texttt{dbUploadSurvey}, \texttt{dbUploadResult}, and \texttt{dbUploadTemplate}.
  Templates can be downloaded to \texttt{templateList} objects with \texttt{dbDownloadTemplate}, and results can be downloaded to \texttt{detectionList} objects with \texttt{dbDownloadResult}.
  Survey metadata and media card/recorder metadata are downloaded with \texttt{dbDownloadSurvey} and \texttt{dbDownloadCardRecorderID}.
An acoustics database schema is available for download at the \href{http://www.uvm.edu/rsenr/vtcfwru/R/?Page=monitoR/monitoR.htm}{monitoR website}, which can be loaded on an active database instance using \texttt{dbSchema} or incorporated into an existing MySQL database. \end{itemize} \section*{Acknowledgements} Generous funding for this work was provided by the National Park Service, the U.S. Geological Survey, and the National Phenology Network. \end{document}