3. Structure of sftrack/sftraj objects

This section goes into more depth on the structure of sftrack/sftraj objects and how to work with them.

To begin, sftrack and sftraj objects are essentially data.frame objects with three required columns (group, geometry, and time). They also inherit from the sf class, which allows them to act like sf objects when working with functions in the sf package but act as sftrack objects for all other actions. Where possible, sftrack objects mimic sf functionality, so many of the same terms and approaches are used.

library("sftrack")

# Make tracks from raw data
data("raccoon", package = "sftrack")
#raccoon <- read.csv(system.file("extdata/raccoon_data.csv", package="sftrack"))
raccoon$month <- as.POSIXlt(raccoon$timestamp)$mon + 1

raccoon$time <- as.POSIXct(raccoon$timestamp, tz = "EST")
coords <- c("longitude","latitude")
group <- list(id = raccoon$animal_id, month = as.POSIXlt(raccoon$timestamp)$mon+1)
time <- "time"
error <- "fix"
crs <- 4326
# create a sftrack object
my_sftrack <- as_sftrack(data = raccoon, coords = coords, group = group, time = time, error = error, crs = crs)

# create a sftraj object
my_sftraj <- as_sftraj(data = raccoon, coords = coords, group = group, time = time, error = error, crs = crs)

In order for sftrack to function as an sf object, we create the data object as an sf object first (using st_as_sf()), and then add the sftrack attributes to the object. The class of an sftrack object is sftrack -> sf -> data.frame although the data.frame class is rarely called upon.

An sftrack object has five attributes in total: two are created by sf (sf_column and agr) and three are created by sftrack (group_col, time_col, and error_col).

attributes(my_sftrack)[-(1:2)]
## $class
## [1] "sftrack"    "sf"         "data.frame"
## 
## $sf_column
## [1] "geometry"
## 
## $agr
## animal_id  latitude longitude timestamp    height      hdop      vdop       fix 
##      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA> 
##     month      time sft_group 
##      <NA>      <NA>      <NA> 
## Levels: constant aggregate identity
## 
## $group_col
## [1] "sft_group"
## 
## $time_col
## [1] "time"
## 
## $error_col
## [1] "fix"

The sftrack-level attributes are simply pointers to the data: each one stores the name of a column. Any attributes relevant to the grouping or geometry are stored in those columns themselves.
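Because these attributes only hold column names, they can be used to look up the columns they point to. The following is a minimal sketch that rebuilds the raccoon example and uses only base R's attr() and [[; it is one way to inspect the object, not a dedicated sftrack accessor.

```r
library("sftrack")

# Rebuild the example object from the raccoon data shipped with sftrack
data("raccoon", package = "sftrack")
raccoon$time <- as.POSIXct(raccoon$timestamp, tz = "EST")
my_sftrack <- as_sftrack(
  data = raccoon, coords = c("longitude", "latitude"),
  group = list(id = raccoon$animal_id,
               month = as.POSIXlt(raccoon$timestamp)$mon + 1),
  time = "time", error = "fix", crs = 4326
)

# The attributes store column names, not the data itself
time_col  <- attr(my_sftrack, "time_col")
group_col <- attr(my_sftrack, "group_col")

# Use the names to pull out the columns they point to
class(my_sftrack[[time_col]])
class(my_sftrack[[group_col]])
```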

Geometry

The geometry column is an sfc object and contains the important spatial information for the track. As NA points are common and important in movement data, we create the sfc object with the option na.fail = FALSE, so that missing coordinates are kept as empty geometries instead of raising an error.

my_sftrack$geometry
## Geometry set for 445 features  (with 168 geometries empty)
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -80.28149 ymin: 26.06761 xmax: -80.27046 ymax: 26.07706
## Geodetic CRS:  WGS 84
## First 5 geometries:
## POINT EMPTY
## POINT (-80.27906 26.06945)
## POINT EMPTY
## POINT EMPTY
## POINT (-80.27431 26.06769)
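The effect of na.fail can be seen with sf alone. This is a small sketch on a made-up three-row data.frame (the name fixes is hypothetical); it shows the sf option only and is not necessarily how sftrack builds its geometry column internally.

```r
library("sf")

# Three fixes, the second one missing
fixes <- data.frame(
  x = c(-80.27906, NA, -80.27431),
  y = c(26.06945, NA, 26.06769)
)

# With the default na.fail = TRUE this call would stop with an error
pts <- st_as_sf(fixes, coords = c("x", "y"), crs = 4326, na.fail = FALSE)

# The row with the missing fix is kept rather than dropped
nrow(pts)

# st_is_empty() reports which geometries are empty
st_is_empty(pts)
```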

The structure of the sfc column depends on the movement class. An sftrack is a collection of POINTs, while an sftraj is a GEOMETRY collection of POINTs and LINESTRINGs.

df1 <- data.frame(
    id = c(1, 1, 1, 1,1,1),
    month = c(1,1,1,1,1,1),
    x = c(27, 27, 27, NA,29,30),
    y = c(-80,-81,-82,NA, 83,83),
    timez = as.POSIXct("2020-01-01 12:00:00", tz = "UTC") + 60*60*(1:6)
)

test_sftraj <- as_sftraj(data = df1, group = list(id = df1$id, month = df1$month),
    time = df1$timez, active_group = c("id","month"), coords = df1[,c("x","y")])
test_sftraj$geometry
## Geometry set for 6 features  (with 1 geometry empty)
## Geometry type: GEOMETRY
## Dimension:     XY
## Bounding box:  xmin: 27 ymin: -82 xmax: 30 ymax: 83
## CRS:           NA
## First 5 geometries:
## LINESTRING (27 -80, 27 -81)
## LINESTRING (27 -81, 27 -82)
## POINT (27 -82)
## POINT EMPTY
## LINESTRING (29 83, 30 83)
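To check the mix of geometry types without reading the printed geometries, the types can be extracted with sf::st_geometry_type(). This sketch repeats the small example above so that it runs on its own.

```r
library("sftrack")
library("sf")

df1 <- data.frame(
  id = c(1, 1, 1, 1, 1, 1),
  month = c(1, 1, 1, 1, 1, 1),
  x = c(27, 27, 27, NA, 29, 30),
  y = c(-80, -81, -82, NA, 83, 83),
  timez = as.POSIXct("2020-01-01 12:00:00", tz = "UTC") + 60 * 60 * (1:6)
)

test_sftraj <- as_sftraj(
  data = df1, group = list(id = df1$id, month = df1$month),
  time = df1$timez, active_group = c("id", "month"),
  coords = df1[, c("x", "y")]
)

# One type per row: a step with both end points is a LINESTRING,
# a step without a usable end point is a POINT
types <- as.character(st_geometry_type(test_sftraj$geometry))
types
```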

Grouping

The grouping column is very specialized, and we cover it in its own section. To begin, the novel attributes it stores are the active_group and the sort_index, which is a factor of the current active groups. The grouping class consists of a single row-level group, the s_group (a.k.a. single group), and a column-level collection of s_groups called a c_grouping (a.k.a. column/collection grouping). This column acts as robust storage for the groupings and is maintained across an sftrack object.

attributes(my_sftrack$sft_group[1:10])
## $active_group
## [1] "id"    "month"
## 
## $sort_index
##  [1] TTP-058_1 TTP-058_1 TTP-058_1 TTP-058_1 TTP-058_1 TTP-058_1 TTP-058_1
##  [8] TTP-058_1 TTP-058_1 TTP-058_1
## Levels: TTP-058_1
## 
## $class
## [1] "c_grouping"
summary(my_sftrack)
##   animal_id            latitude       longitude     
##  Length:445         Min.   :26.07   Min.   :-80.28  
##  Class :character   1st Qu.:26.07   1st Qu.:-80.28  
##  Mode  :character   Median :26.07   Median :-80.28  
##                     Mean   :26.07   Mean   :-80.28  
##                     3rd Qu.:26.07   3rd Qu.:-80.28  
##                     Max.   :26.08   Max.   :-80.27  
##                     NA's   :168     NA's   :168     
##    timestamp                          height             hdop      
##  Min.   :2019-01-19 00:02:30.00   Min.   : -30.00   Min.   :0.000  
##  1st Qu.:2019-01-22 07:02:30.00   1st Qu.:   1.00   1st Qu.:0.000  
##  Median :2019-01-25 23:02:30.00   Median :   7.00   Median :1.300  
##  Mean   :2019-01-25 22:22:18.39   Mean   :  36.65   Mean   :1.691  
##  3rd Qu.:2019-01-29 07:02:09.00   3rd Qu.:  15.50   3rd Qu.:2.500  
##  Max.   :2019-02-01 23:02:30.00   Max.   :1107.00   Max.   :9.900  
##                                   NA's   :198                      
##       vdop       fix          month            time                       
##  Min.   :0.000   2D: 37   Min.   :1.000   Min.   :2019-01-19 00:02:30.00  
##  1st Qu.:0.000   3D:240   1st Qu.:1.000   1st Qu.:2019-01-22 07:02:30.00  
##  Median :1.900   NO:168   Median :1.000   Median :2019-01-25 23:02:30.00  
##  Mean   :1.938            Mean   :1.067   Mean   :2019-01-25 22:22:18.39  
##  3rd Qu.:3.200            3rd Qu.:1.000   3rd Qu.:2019-01-29 07:02:09.00  
##  Max.   :8.400            Max.   :2.000   Max.   :2019-02-01 23:02:30.00  
##                                                                           
##                    sft_group            geometry  
##  TTP-041_1              :208   POINT        :445  
##  TTP-041_2              : 15   epsg:4326    :  0  
##  TTP-058_1              :207   +proj=long...:  0  
##  TTP-058_2              : 15                      
##  active_group: id, month:  0                      
##                                                   
## 

Time

The time column must be either integer or POSIXct, and all values in the column must be of the same type. Beyond that, the column has little specialized functionality. sftrack uses the time column to order outputs for analysis and attempts to order the data when an sftrack object is first created; however, the data.frame is not required to be ordered for analysis. check_ordered() is called before analysis, and otherwise it is assumed that the order does not matter. This is particularly true for an sftraj, where the geometry itself contains information about t1 and t2.
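What "ordered" means here can be sketched in base R: within each group, timestamps should never decrease from one row to the next. The helper is_time_ordered() below is hypothetical and written for illustration only; it is not sftrack's check_ordered() and makes no claim about how that function is implemented.

```r
# Hypothetical helper, for illustration only (not sftrack's check_ordered())
is_time_ordered <- function(time, group) {
  # TRUE for each group whose timestamps never decrease from row to row
  vapply(split(as.numeric(time), group),
         function(tt) !is.unsorted(tt), logical(1))
}

obs <- data.frame(
  id   = c("a", "a", "a", "b", "b"),
  time = as.POSIXct("2020-01-01 12:00:00", tz = "UTC") +
    3600 * c(1, 2, 3, 2, 1)
)

# Group "a" is in order, group "b" is not
before <- is_time_ordered(obs$time, obs$id)
before

# Sorting by group and then by time puts every group in order
obs_sorted <- obs[order(obs$id, obs$time), ]
after <- is_time_ordered(obs_sorted$time, obs_sorted$id)
after
```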

Error

The error column holds the error information for the spatial points. At present we have not built any particular functionality around it, but we plan to in the future, or we reserve it for other developers to build upon.

Subsetting

An sftrack object acts like a data.frame and an sf object whenever appropriate. Because of this, you can subset an sftrack object as you would a data.frame.

Row subsetting is therefore very straightforward, as each row represents an individual point in time.

my_sftrack[1:10,]
## Sftrack with 10 features and 12 fields (4 empty geometries) 
## Geometry : "geometry" (XY, crs: WGS 84) 
## Timestamp : "time" (POSIXct in UTC) 
## Groupings : "sft_group" (*id*, *month*) 
## -------------------------------
##    animal_id latitude longitude           timestamp height hdop vdop fix month
## 1    TTP-058       NA        NA 2019-01-19 00:02:30     NA  0.0  0.0  NO     1
## 2    TTP-058 26.06945 -80.27906 2019-01-19 01:02:30      7  6.2  3.2  2D     1
## 3    TTP-058       NA        NA 2019-01-19 02:02:30     NA  0.0  0.0  NO     1
## 4    TTP-058       NA        NA 2019-01-19 03:02:30     NA  0.0  0.0  NO     1
## 5    TTP-058 26.06769 -80.27431 2019-01-19 04:02:30    858  5.1  3.2  2D     1
## 6    TTP-058 26.06867 -80.27930 2019-01-19 05:02:30    350  1.9  3.2  3D     1
## 7    TTP-058 26.06962 -80.27908 2019-01-19 06:02:30     11  2.3  4.5  3D     1
## 8    TTP-058 26.06963 -80.27902 2019-01-19 07:02:04      9  2.7  3.9  3D     1
## 9    TTP-058       NA        NA 2019-01-19 08:02:30     NA  0.0  0.0  NO     1
## 10   TTP-058 26.06982 -80.27900 2019-01-19 17:02:30     NA  2.0  3.3  3D     1
##                   time               sft_group                   geometry
## 1  2019-01-19 00:02:30 (id: TTP-058, month: 1)                POINT EMPTY
## 2  2019-01-19 01:02:30 (id: TTP-058, month: 1) POINT (-80.27906 26.06945)
## 3  2019-01-19 02:02:30 (id: TTP-058, month: 1)                POINT EMPTY
## 4  2019-01-19 03:02:30 (id: TTP-058, month: 1)                POINT EMPTY
## 5  2019-01-19 04:02:30 (id: TTP-058, month: 1) POINT (-80.27431 26.06769)
## 6  2019-01-19 05:02:30 (id: TTP-058, month: 1)  POINT (-80.2793 26.06867)
## 7  2019-01-19 06:02:30 (id: TTP-058, month: 1) POINT (-80.27908 26.06962)
## 8  2019-01-19 07:02:04 (id: TTP-058, month: 1) POINT (-80.27902 26.06963)
## 9  2019-01-19 08:02:30 (id: TTP-058, month: 1)                POINT EMPTY
## 10 2019-01-19 17:02:30 (id: TTP-058, month: 1)   POINT (-80.279 26.06982)

Unlike a data.frame, however, sftrack attempts to retain the geometry, group, and time columns, in order to maintain sftrack status. This is similar to how sf deals with the geometry column.

my_sftrack[1:3,c(1:3)]
## Sftrack with 3 features and 7 fields (2 empty geometries) 
## Geometry : "geometry" (XY, crs: WGS 84) 
## Timestamp : "time" (POSIXct in UTC) 
## Groupings : "sft_group" (*id*, *month*) 
## -------------------------------
##   animal_id latitude longitude               sft_group                time fix
## 1   TTP-058       NA        NA (id: TTP-058, month: 1) 2019-01-19 00:02:30  NO
## 2   TTP-058 26.06945 -80.27906 (id: TTP-058, month: 1) 2019-01-19 01:02:30  2D
## 3   TTP-058       NA        NA (id: TTP-058, month: 1) 2019-01-19 02:02:30  NO
##                     geometry
## 1                POINT EMPTY
## 2 POINT (-80.27906 26.06945)
## 3                POINT EMPTY

To turn off this feature, use the drop = TRUE argument, which returns a data.frame object instead. If you would like to revert to an sf object, sf::st_sf(data) converts the object back to sf.

my_sftrack[1:3,c(1:3), drop = TRUE]
##   animal_id latitude longitude
## 1   TTP-058       NA        NA
## 2   TTP-058 26.06945 -80.27906
## 3   TTP-058       NA        NA

sftrajs work nearly the same as sftracks. However, because they are a step model in which each step is modeled as t1 -> t2, it is important to note that subsetting will not automatically recalculate any steps for you, even if the original t2 point has been deleted.

If your subsetting also changes the end points of steps, you can recalculate them using step_recalc(). The output is your original sftraj object with the geometry column recalculated to the new t2s based on the timestamp. The result can be very different from the subsetted object before recalculation, so be careful.

plot(my_sftraj, main = "Original")

new_traj <- my_sftraj[seq(10, nrow(my_sftraj), 10), ]

plot(new_traj, main = "Before recalculation")

plot(step_recalc(new_traj),  main = "After recalculation")
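Why the steps go stale can be seen on a toy example in base R. The helper make_steps() and the data below are hypothetical and only mimic the idea of a step model on one coordinate; this is not how step_recalc() is implemented.

```r
# Hypothetical helper: build steps (t1 -> t2) from points ordered in time
make_steps <- function(d) {
  data.frame(
    t1 = d$t,
    x1 = d$x,
    x2 = c(d$x[-1], NA)  # end point is the next row; the last point has none
  )
}

pts <- data.frame(t = 1:6, x = c(0, 1, 2, 3, 4, 5))
steps <- make_steps(pts)

keep <- c(1, 3, 5)
stale <- steps[keep, ]            # subset only: x2 still refers to dropped points
fresh <- make_steps(pts[keep, ])  # rebuilt: x2 is the next kept point

stale$x2
fresh$x2
```

In the subset-only version each step still ends at a point that is no longer in the data, while the rebuilt version connects each kept point to the next kept point.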

Some basic methods and functions of sftrack and sftraj objects

print

The print() layout is a modified version of the sf print function. It returns important information summarizing the sftrack object, such as the geometry information and grouping information. The print function defaults to printing 1000 rows and displaying all the columns. This can be modified using the n_row and n_col arguments, which subset the printed output along the respective axis. When n_col is used, the display shows the grouping, geometry, and time fields plus any other columns, starting from column 1, until the number of those columns plus 3 equals n_col. n_col and n_row are optional arguments, and if values are not supplied they fall back to the data.frame defaults. Note: n_row is not equivalent to head(), as head() physically subsets the data while the n_row option only modifies the printed output.

print(my_sftrack, 5, 10)
## Sftrack with 445 features and 12 fields (168 empty geometries) 
## Geometry : "geometry" (XY, crs: WGS 84) 
## Timestamp : "time" (POSIXct in UTC) 
## Groupings : "sft_group" (*id*, *month*) 
## -------------------------------
##   animal_id latitude longitude           timestamp height hdop vdop fix month
## 1   TTP-058       NA        NA 2019-01-19 00:02:30     NA  0.0  0.0  NO     1
## 2   TTP-058 26.06945 -80.27906 2019-01-19 01:02:30      7  6.2  3.2  2D     1
## 3   TTP-058       NA        NA 2019-01-19 02:02:30     NA  0.0  0.0  NO     1
## 4   TTP-058       NA        NA 2019-01-19 03:02:30     NA  0.0  0.0  NO     1
## 5   TTP-058 26.06769 -80.27431 2019-01-19 04:02:30    858  5.1  3.2  2D     1
##                  time               sft_group                   geometry
## 1 2019-01-19 00:02:30 (id: TTP-058, month: 1)                POINT EMPTY
## 2 2019-01-19 01:02:30 (id: TTP-058, month: 1) POINT (-80.27906 26.06945)
## 3 2019-01-19 02:02:30 (id: TTP-058, month: 1)                POINT EMPTY
## 4 2019-01-19 03:02:30 (id: TTP-058, month: 1)                POINT EMPTY
## 5 2019-01-19 04:02:30 (id: TTP-058, month: 1) POINT (-80.27431 26.06769)

summary

summary() works as you'd expect for a data.frame, except that it displays the grouping column as a count of each active_group combination, along with the active_group for that column.

summary(my_sftrack)
##   animal_id            latitude       longitude     
##  Length:445         Min.   :26.07   Min.   :-80.28  
##  Class :character   1st Qu.:26.07   1st Qu.:-80.28  
##  Mode  :character   Median :26.07   Median :-80.28  
##                     Mean   :26.07   Mean   :-80.28  
##                     3rd Qu.:26.07   3rd Qu.:-80.28  
##                     Max.   :26.08   Max.   :-80.27  
##                     NA's   :168     NA's   :168     
##    timestamp                          height             hdop      
##  Min.   :2019-01-19 00:02:30.00   Min.   : -30.00   Min.   :0.000  
##  1st Qu.:2019-01-22 07:02:30.00   1st Qu.:   1.00   1st Qu.:0.000  
##  Median :2019-01-25 23:02:30.00   Median :   7.00   Median :1.300  
##  Mean   :2019-01-25 22:22:18.39   Mean   :  36.65   Mean   :1.691  
##  3rd Qu.:2019-01-29 07:02:09.00   3rd Qu.:  15.50   3rd Qu.:2.500  
##  Max.   :2019-02-01 23:02:30.00   Max.   :1107.00   Max.   :9.900  
##                                   NA's   :198                      
##       vdop       fix          month            time                       
##  Min.   :0.000   2D: 37   Min.   :1.000   Min.   :2019-01-19 00:02:30.00  
##  1st Qu.:0.000   3D:240   1st Qu.:1.000   1st Qu.:2019-01-22 07:02:30.00  
##  Median :1.900   NO:168   Median :1.000   Median :2019-01-25 23:02:30.00  
##  Mean   :1.938            Mean   :1.067   Mean   :2019-01-25 22:22:18.39  
##  3rd Qu.:3.200            3rd Qu.:1.000   3rd Qu.:2019-01-29 07:02:09.00  
##  Max.   :8.400            Max.   :2.000   Max.   :2019-02-01 23:02:30.00  
##                                                                           
##                    sft_group            geometry  
##  TTP-041_1              :208   POINT        :445  
##  TTP-041_2              : 15   epsg:4326    :  0  
##  TTP-058_1              :207   +proj=long...:  0  
##  TTP-058_2              : 15                      
##  active_group: id, month:  0                      
##                                                   
## 

summary_sftrack

summary_sftrack() is a summary function specific to sftrack objects. It summarizes the data based on the beginning and end of each grouping as well as the total distance of the grouping. This function uses st_distance from the sf package, so the units of the output depend on the CRS: for projected data they are the units of the CRS, and for longitude/latitude data such as this example (EPSG:4326) the distance is in meters.

summary_sftrack(my_sftrack)
##       group points NAs          begin_time            end_time    length_m
## 1 TTP-041_1    208   0 2019-01-19 00:02:30 2019-01-31 23:02:30 10125.58779
## 2 TTP-041_2     15   0 2019-02-01 00:02:30 2019-02-01 23:02:07    32.28359
## 3 TTP-058_1    207   0 2019-01-19 00:02:30 2019-01-31 23:02:30 24724.31991
## 4 TTP-058_2     15   0 2019-02-01 00:02:30 2019-02-01 23:02:30  1927.07818
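To see which unit st_distance reports for a given CRS, it can be called directly on a pair of points. The sketch below uses two fixes copied from the raccoon output shown earlier and assumes only the sf and units packages; it illustrates the unit handling, not the internals of summary_sftrack().

```r
library("sf")

# Two fixes from the raccoon track (longitude/latitude, WGS 84)
fixes <- st_sfc(
  st_point(c(-80.27906, 26.06945)),
  st_point(c(-80.27431, 26.06769)),
  crs = 4326
)

# Distance between the first and second fix; the result carries its unit
step_length <- st_distance(fixes[1], fixes[2])
step_length

# Extract the unit as a string
units::deparse_unit(step_length)
```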

You can also trigger this function by using summary(data, stats = TRUE):

summary(my_sftrack, stats = TRUE)
##       group points NAs          begin_time            end_time    length_m
## 1 TTP-041_1    208   0 2019-01-19 00:02:30 2019-01-31 23:02:30 10125.58779
## 2 TTP-041_2     15   0 2019-02-01 00:02:30 2019-02-01 23:02:07    32.28359
## 3 TTP-058_1    207   0 2019-01-19 00:02:30 2019-01-31 23:02:30 24724.31991
## 4 TTP-058_2     15   0 2019-02-01 00:02:30 2019-02-01 23:02:30  1927.07818