% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/das_sight.R
\name{das_sight}
\alias{das_sight}
\alias{das_sight.data.frame}
\alias{das_sight.das_df}
\title{DAS sightings}
\usage{
das_sight(x, ...)

\method{das_sight}{data.frame}(x, ...)

\method{das_sight}{das_df}(
  x,
  return.format = c("default", "wide", "complete"),
  return.events = c("S", "K", "M", "G", "s", "k", "m", "g", "t", "p", "F"),
  ...
)
}
\arguments{
\item{x}{an object of class \code{das_df},
or a data frame that can be coerced to class \code{das_df}}

\item{...}{ignored}

\item{return.format}{character; one of "default", "wide", or "complete",
or any partial match thereof (case-sensitive). The formats are described below}

\item{return.events}{character; event codes included in the output.
Must be one or more of: "S", "K", "M", "G", "s", "k", "m", "g", "t", "p", "F"
(case-sensitive). The default is all of these event codes}
}
\value{
Data frame with 1) the columns from \code{x}, excluding the 'Data#' columns,
  and 2) columns with sighting information extracted from 'Data#' columns.
  See \code{\link{das_format_pdf}} for more information about the sighting information.
  If \code{return.format} is "default", then there is one row for each species of each sighting event;
  if \code{return.format} is "wide", then there is one row for each sighting event;
  if \code{return.format} is "complete", then there is one row for every
  group size estimate for each sighting event (excluding sperm whale "C" events - see the Details section).

  The format-specific columns are described in their respective sections.
  The following sighting information columns are included in all return formats:

  \tabular{lll}{
    \emph{Sighting information}                \tab \emph{Column name} \tab \emph{Notes} \cr
    Sighting number                            \tab SightNo      \tab Character                                                                              \cr
    Subgroup code                              \tab Subgroup     \tab Character                                                                              \cr
    Daily sighting number                      \tab SightNoDaily \tab See below                                                                              \cr
    Observer that made the sighting            \tab Obs          \tab                                                                                        \cr
    Standard observer                          \tab ObsStd       \tab Logical; \code{TRUE} if Obs is one of ObsL, Rec or ObsR, and \code{FALSE} otherwise    \cr
    Bearing to the sighting                    \tab Bearing      \tab Numeric; degrees, expected range 0 to 360                                              \cr
    Number of reticle marks                    \tab Reticle      \tab Numeric                                                                                \cr
    Distance (nautical miles)                  \tab DistNm       \tab Numeric                                                                                \cr
    Sighting cue                               \tab Cue          \tab                                                                                        \cr
    Sighting method                            \tab Method       \tab                                                                                        \cr
    Photos of school?                          \tab Photos       \tab                                                                                        \cr
    Birds present with school?                 \tab Birds        \tab                                                                                        \cr
    Calibration school?                        \tab CalibSchool  \tab                                                                                        \cr
    Aerial photos taken?                       \tab PhotosAerial \tab                                                                                        \cr
    Biopsy taken?                              \tab Biopsy       \tab                                                                                        \cr
    Probable sighting                          \tab Prob         \tab Logical indicating if sighting has associated ? event; \code{NA} for non-S/K/M/G events\cr
    Number of species in sighting              \tab nSp          \tab \code{NA} for non-S/K/M/G events                                                       \cr
    Mixed species sighting                     \tab Mixed        \tab Logical; \code{TRUE} if nSp > 1                                                        \cr
    Group size of school - best estimate       \tab GsSchoolBest \tab See below                                                                              \cr
    Group size of school - high estimate       \tab GsSchoolHigh \tab See below                                                                              \cr
    Group size of school - low estimate        \tab GsSchoolLow  \tab See below                                                                              \cr
    Course (true heading) of school at resight \tab CourseSchool \tab \code{NA} for non-s/k/m events                                                         \cr
    Presence of associated JFR                 \tab TurtleJFR    \tab \code{NA} for non-"t" events; JFR = jellyfish, floating debris, or red tide            \cr
    Estimated turtle maturity                  \tab TurtleAge    \tab \code{NA} for non-"t" events                                                           \cr
    Perpendicular distance (km) to sighting    \tab PerpDistKm   \tab Calculated via \code{(abs(sin(Bearing*pi/180) * DistNm) * 1.852)}
  }

  SightNoDaily is a running count of the number of S/K/M/G sightings that occurred on each day.
  It is formatted as 'YYYYMMDD'_'running count', e.g. "20050101_1".

  The GsSchoolBest, GsSchoolHigh, and GsSchoolLow columns are either:
  1) the arithmetic mean across observer estimates, for the "default" and "wide" formats, or
  2) the individual observer estimates, for the "complete" format.
  Note that for non-"complete" formats, \code{na.rm = TRUE} is used when calculating the mean,
  and thus blank elements of estimates (but not the whole incomplete estimate) are ignored.

  To convert the perpendicular distance back to nautical miles,
  one would divide PerpDistKm by 1.852
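
  As a minimal sketch of the PerpDistKm calculation above, using base R only
  (the \code{bearing} and \code{dist_nm} vectors are made-up illustrative values,
  not data from this package):

  \preformatted{
bearing <- c(0, 30, 90, 270)   # degrees
dist_nm <- c(2, 2, 1.5, 1)     # nautical miles

# Perpendicular distance in km; 1 nautical mile = 1.852 km
perp_dist_km <- abs(sin(bearing * pi / 180) * dist_nm) * 1.852

# Converting back to nautical miles
perp_dist_nm <- perp_dist_km / 1.852
}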
}
\description{
Extract sightings and associated information from processed DAS data
}
\details{
DAS events contain specific information in the 'Data#' columns,
  with the information depending on the event code for that row.
  The output data frame contains columns with this specific information
  extracted to dedicated columns as described below.
  This function recognizes the following types of sightings:
  marine mammal sightings (event codes "S", "K", or "M"),
  marine mammal resights (codes "s", "k", "m"),
  marine mammal subgroup sightings (code "G"),
  marine mammal subgroup resights (code "g"),
  turtle sightings (code "t"),
  pinniped sightings (code "p"),
  and fishing vessel sightings (code "F").
  Warnings are printed unless all S, K, M, and G events (and only these events) are
  followed by an A event and at least one numeric event.
  See \code{\link{das_format_pdf}} for more information about events and event formats.
  Of specific note - sperm whale sightings (species code 046) often contain additional estimates
  recorded as "C" events immediately following the S, A, and numeric events.
  Because these estimates are recorded as "C" events, they are NOT included in the
  \code{das_sight} calculations or output for any \code{return.format}.

  The \code{return.events} argument simply provides a shortcut for
  filtering the output of \code{das_sight} by event codes

  Abbreviations used in output column names: Gs = group size, Sp = species,
  Nm = nautical mile, Perc = percentage, Prob = probable,
  GsSchool = school-level group size info

  This function makes the following assumptions about, and alterations to, the raw DAS data:
  \itemize{
    \item "A" events immediately following an S/K/M/G event have
    the same sighting number (Data1 value) as the S/K/M/G event
    \item The 'nSp' column is equivalent to the number of non-\code{NA} values across the
      'Data5', 'Data6', 'Data7', and 'Data8' columns
      for the pertinent "A" event
    \item The following data are coerced to numeric using
      \code{\link[base:numeric]{as.numeric}}:
      Bearing, Reticle, DistNm, Cue, Method,
      species percentages, and group sizes (including for t, p, and F events).
      Note that if there are any formatting errors and these data are not numeric,
      the function will likely print a warning message
    \item The values for the following columns are capitalized using
      \code{\link[base:chartr]{toupper}}:
      'Birds', 'Photos', 'CalibSchool', 'PhotosAerial', 'Biopsy',
      'TurtleAge', and 'TurtleCapt'
  }
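
  The 'nSp' rule above can be sketched in base R as follows.
  This is an illustration only, not the internal implementation;
  the data frame \code{a.events} and its values are hypothetical,
  and Data5 to Data8 are assumed here to hold species codes:

  \preformatted{
# Hypothetical "A" event rows; Data5-Data8 hold up to four species codes
a.events <- data.frame(
  Data5 = c("018", "076", "013"),
  Data6 = c(NA, "016", "015"),
  Data7 = c(NA, NA, "002"),
  Data8 = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

# nSp: number of non-NA values across Data5 to Data8 for each row
nSp <- rowSums(!is.na(a.events[, c("Data5", "Data6", "Data7", "Data8")]))

# Mixed: TRUE if more than one species was recorded
Mixed <- nSp > 1
}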
}
\section{The "default" format output}{

  This output data frame contains 'long' sighting data, meaning there is one row for each species of each sighting event.
  The GsSp... columns are calculated as follows:
  for each species and for each observer estimate, the best/high/low school size estimate is multiplied by the applicable species percent estimate.
  The values are grouped by species and then averaged
  (using \code{\link[base]{mean}} with \code{na.rm = TRUE})
  to get single GsSpBest, GsSpHigh, and GsSpLow values for each species.

  Sighting information columns/formats present specifically in the "default" format output:
  \tabular{lll}{
    \emph{Sighting information} \tab \emph{Column name} \tab \emph{Notes}\cr
    Species code          \tab SpCode \tab Boat type or mammal, turtle, or pinniped species codes\cr
    Probable species code \tab SpCodeProb \tab Probable mammal species codes; \code{NA} if none or not applicable\cr
    Group size of species - best estimate \tab GsSpBest \tab
      The product of the arithmetic means of GsSchoolBest and the corresponding species percentage\cr
    Group size of species - high estimate \tab GsSpHigh \tab
      The product of the arithmetic means of GsSchoolHigh and the corresponding species percentage\cr
    Group size of species - low estimate \tab GsSpLow \tab
      The product of the arithmetic means of GsSchoolLow and the corresponding species percentage\cr
  }

  Note that for the above calculations,
  the GsSchoolX value and corresponding species percentages were each
  averaged across observers, using \code{na.rm = TRUE},
  before being multiplied to calculate GsSpX. For example, in the workflow:
  \code{GsSpBest1 = mean(.data$Data2, na.rm = TRUE) * mean(.data$Data5, na.rm = TRUE)}
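
  A minimal numeric sketch of this product-of-means calculation, using made-up
  estimates from three observers. It assumes that species percentages are
  recorded on a 0-100 scale and converted to proportions; that conversion is
  an assumption of this sketch, not something stated above:

  \preformatted{
gs.best  <- c(10, 12, NA)   # best school size estimates; one is blank
sp.perc1 <- c(80, 70, 75)   # percentage of species 1 in the school

# Average each quantity across observers first, ignoring NA values,
# then multiply the two means (dividing by 100 is an assumption)
gs.sp.best1 <- mean(gs.best, na.rm = TRUE) * mean(sp.perc1, na.rm = TRUE) / 100
gs.sp.best1
}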
}

\section{The "wide" and "complete" format outputs}{

  The "wide" and "complete" options have very similar columns in their output data frames.
  There are two main differences: 1) the "wide" format has one row for each sighting event,
  while the "complete" format has one row for every observer estimate for each sighting, and thus
  2) in the "wide" format, all numeric information for which there are multiple observer estimates
  (school group size, species percentage, etc.) is averaged across estimates via
  an arithmetic mean (using \code{\link[base]{mean}} with \code{na.rm = TRUE})

  With these formats, note that the species/type code and group size for
  turtle, pinniped, and boat sightings are in their own columns

  Sighting information columns present in the "wide" and "complete" format outputs:
  \tabular{lll}{
    \emph{Sighting information}  \tab \emph{Column name}  \tab \emph{Notes}                            \cr
    Observer code - estimate     \tab ObsEstimate         \tab See below                               \cr
    Species 1 code               \tab SpCode1             \tab                                         \cr
    Species 2 code               \tab SpCode2             \tab                                         \cr
    Species 3 code               \tab SpCode3             \tab                                         \cr
    Species 4 code               \tab SpCode4             \tab                                         \cr
    Species 1 probable code      \tab SpCodeProb1         \tab Extracted from '?' event                \cr
    Species 2 probable code      \tab SpCodeProb2         \tab Extracted from '?' event                \cr
    Species 3 probable code      \tab SpCodeProb3         \tab Extracted from '?' event                \cr
    Species 4 probable code      \tab SpCodeProb4         \tab Extracted from '?' event                \cr
    Percentage of Sp 1 in school \tab SpPerc1             \tab                                         \cr
    Percentage of Sp 2 in school \tab SpPerc2             \tab                                         \cr
    Percentage of Sp 3 in school \tab SpPerc3             \tab                                         \cr
    Percentage of Sp 4 in school \tab SpPerc4             \tab                                         \cr
    Group size of species 1      \tab GsSpBest1           \tab Present in "wide" output only; see below\cr
    Group size of species 2      \tab GsSpBest2           \tab Present in "wide" output only; see below\cr
    Group size of species 3      \tab GsSpBest3           \tab Present in "wide" output only; see below\cr
    Group size of species 4      \tab GsSpBest4           \tab Present in "wide" output only; see below\cr
    Turtle species               \tab TurtleSp            \tab \code{NA} for non-"t" events            \cr
    Turtle group size            \tab TurtleGs            \tab \code{NA} for non-"t" events            \cr
    Was turtle captured?         \tab TurtleCapt          \tab \code{NA} for non-"t" events            \cr
    Pinniped species             \tab PinnipedSp          \tab \code{NA} for non-"p" events            \cr
    Pinniped group size          \tab PinnipedGs          \tab \code{NA} for non-"p" events            \cr
    Boat or gear type            \tab BoatType            \tab \code{NA} for non-"F" events            \cr
    Number of boats              \tab BoatGs              \tab \code{NA} for non-"F" events
  }

  ObsEstimate refers to the code of the observer that made the corresponding estimate.
  For the "wide" format, ObsEstimate is a list-column of all of the observer codes
  that provided an estimate.
  Also in the "wide" format, the GsSpBest# columns are the product of
  the means of GsSchoolBest and the corresponding species percentage
  (see the Default section for calculation details).
  The numbers 1 to 4 in these column names correspond to the order of the data as it appears in the DAS file
}

\examples{
y <- system.file("das_sample.das", package = "swfscDAS")
y.proc <- das_process(y)

das_sight(y.proc)
das_sight(y.proc, return.format = "complete")
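
# Sketch of the other documented options, reusing y.proc from above:
# the "wide" format has one row for each sighting event
y.wide <- das_sight(y.proc, return.format = "wide")

# return.events as a shortcut for filtering the output by event code
y.S <- das_sight(y.proc, return.events = "S")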

}
