SPLICE: A Synthetic Paid Loss and Incurred Cost Experience Simulator

This vignette aims to illustrate how the SPLICE package can be used to generate the case estimates of incurred losses of individual claims.

SPLICE (Synthetic Paid Loss and Incurred Cost Experience) is built on an existing simulator of paid claim experience called SynthETIC, which offers flexible modelling of occurrence, notification, as well as the timing and magnitude of individual partial payments (see the package documentation and vignette for a detailed example on how to use the package to simulate paid claims experience).

SPLICE enables the modelling of incurred loss estimates, via the following three modules:

Set Up

library(SPLICE)
#> Loading required package: SynthETIC
set.seed(20201006)
ref_claim <- return_parameters()[1] # 200,000
time_unit <- return_parameters()[2] # 0.25

For the definition and functionality of ref_claim and time_unit, we refer to the documentation of SynthETIC.

For this demo, we will start with the paid losses simulated by the example implementation of SynthETIC:

test_claims <- SynthETIC::test_claims_object

and simulate the case estimates of incurred losses of the 3624 indidual claims included in the claims object above.

0. Default Implementation

Sections 1-3 introduce the three modelling steps in detail and include extensive examples on how to replace the default implementation with sampling distributions deemed appropriate by the individual users of the program. For those who prefer to stick to the default assumptions, the following code is all that is required to generate the full incurred history:

# major revisions
major <- claim_majRev_freq(test_claims)
major <- claim_majRev_time(test_claims, major)
major <- claim_majRev_size(major)

# minor revisions
minor <- claim_minRev_freq(test_claims)
minor <- claim_minRev_time(test_claims, minor)
minor <- claim_minRev_size(test_claims, major, minor)

# development of case estimates
test <- claim_history(test_claims, major, minor)
test_inflated <- claim_history(test_claims, major, minor,
                               base_inflation_vector = rep((1 + 0.02)^(1/4) - 1, times = 80))

# transactional data
test_incurred_dataset_noInf <- generate_incurred_dataset(test_claims, test)
test_incurred_dataset_inflated <- generate_incurred_dataset(test_claims, test_inflated)

# incurred cumulative triangles
incurred_inflated <- output_incurred(test_inflated, incremental = FALSE)

1. Major Revisions

This section introduces a suite of functions that works together to simulate, in sequential order, the (1) frequency, (2) time, and (3) size of major revisions of incurred loss, for each of the claims occurring in each of the occurrence periods.

In particular, claim_majRev_freq() sets up the structure of the major revisions: a nested list such that the jth component of the ith sub-list is a list of information on major revisions of the jth claim of occurrence period i. The “unit list” (i.e. the smallest, innermost sub-list) contains the following components:

Name	Description
`majRev_freq`	Number of major revisions of incurred loss; see `claim_majRev_freq()`
`majRev_time`	Time of major revisions (from claim notification); see `claim_majRev_time()`
`majRev_factor`	Major revision multiplier of incurred loss; see `claim_majRev_size()`
`majRev_atP`	An indicator, `1` if the last major revision occurs at the time of the last major payment (i.e. second last payment), `0` otherwise; see `claim_majRev_time()`

1.1 Frequency of Major Revisions

claim_majRev_freq() generates the number of major revisions associated with a particular claim, from a user-defined random generation function. Users are free to choose any distribution (through the argument rfun), whether it be a pre-defined distribution in R, or more advanced ones from packages, or a proper user-defined function, to better match their own claim experience.

Let \(K\) represent the number of major revisions associated with a particular claim. The notification of a claim is considered as a major revision, so all claims have at least 1 major revision (\(K \ge 1\)).

Example 1.1.1 Zero-truncated Poisson distribution

One possible sampling distribution for this is the zero-truncated Poisson distribution from the actuar package.

SPLICE by default assumes the (removable) dependence of frequency of major revisions on claim size, which means that the user can specify the lambda parameter in actuar::rztpois as a paramfun (parameter function) of claim_size (and possibly more, see Example 1.1.2).

## paramfun input
# lambda as a function of claim size
no_majRev_param <- function(claim_size) {
  majRevNo_mean <- pmax(1, log(claim_size / 15000) - 2)
  c(lambda = majRevNo_mean)
}

## implementation and output
major_test <- claim_majRev_freq(
  test_claims, rfun = actuar::rztpois, paramfun = no_majRev_param)
# show the distribution of number of major revisions
table(unlist(major_test))
#> 
#>    1    2    3    4    5    6    7 
#> 1958 1084  418  124   32    5    3

Example 1.1.2 Additional dependencies

Like SynthETIC, users of SPLICE are able to add further dependencies in their simulation. This is illustrated in the example below.

Suppose we would like to add the additional dependence of claim_majRev_freq (number of major revisions) on the number of partial payments of the claim - which is not natively included in SPLICE default setting. For example, let’s consider the following parameter function:

## paramfun input
# an extended parameter function
majRevNo_param <- function(claim_size, no_payment) {
  majRevNo_mean <- pmax(0, log(claim_size / 1500000)) + no_payment / 10
  c(lambda = majRevNo_mean)
}

As this parameter function is dependent on no_payment, it should not come at a surprise that we need to supply the number of partial payments when calling claim_majRev_freq(). We need to make sure that the argument names are matched exactly (no_payment in this example) and that the input is specified as a vector of simulated quantities (not a list).

## implementation and output
no_payments_vect <- unlist(test_claims$no_payments_list)
# sample the frequency of major revisions from zero-truncated Poisson
# with parameters above
major_test <- claim_majRev_freq(
  test_claims, rfun = actuar::rztpois, paramfun = majRevNo_param,
  no_payment = no_payments_vect)
# show the distribution of number of major revisions
table(unlist(major_test))
#> 
#>    1    2    3    4    5    6    7 
#> 2818  621  146   28    7    3    1

Example 1.1.3 Default implementation

The default claim_majRev_freq() assumes that no additional major revisions will occur for claims of size smaller than or equal to a claim_size_benchmark. For claims above this threshold, a maximum of 3 major revisions can occur and the larger the claim size, the more likely there will be more major revisions.

There is no need to specify a sampling distribution if the user is happy with the default specification. This example is mainly to demonstrate how the default function works, and at the same time, to provide an example that one can modify to input a random sampling distribution of their choosing.

## input
# package default function for frequency of major revisions
dflt.majRev_freq_function <- function(
  n, claim_size, claim_size_benchmark = 0.075 * ref_claim) {
  
  # construct the range indicator
  test <- (claim_size > claim_size_benchmark)

  # if claim_size <= claim_size_benchmark
  # "small" claims assumed to have no major revisions except at notification
  no_majRev <- rep(1, n)
  # if claim_size is above the benchmark
  # probability of 2 major revisions, increases with claim size
  Pr2 <- 0.1 + 0.3 * 
    min(1, (claim_size[test] - 0.075 * ref_claim)/(0.925 * ref_claim))
  # probability of 3 major revisions, increases with claim size
  Pr3 <- 0.5 *
    min(1, max(0, claim_size[test] - 0.25 * ref_claim)/(0.75 * ref_claim))
  # probability of 1 major revision i.e. only one at claim notification
  Pr1 <- 1 - Pr2 - Pr3
  no_majRev[test] <- sample(
    c(1, 2, 3), size = sum(test), replace = T, prob = c(Pr1, Pr2, Pr3))
  
  no_majRev
}

Since the random function directly takes claim_size as an input, no additional parameterisation is required (unlike in Examples 1 and 2, where we first need a paramfun that turns the claim_size into the lambda parameter required in a zero-truncated Poisson distribution). Here we can simply run claim_majRev_freq() without inputting a paramfun.

## implementation and output
# simulate the number of major revisions
major <- claim_majRev_freq(
  claims = test_claims,
  rfun = dflt.majRev_freq_function
)

# show the distribution of number of major revisions
table(unlist(major))
#> 
#>    1    2    3 
#> 1877  309 1438

# view the major revision history of the first claim in the 1st occurrence period
# note that the time and size of the major revisions are yet to be generated
major[[1]][[1]]
#> $majRev_freq
#> [1] 3
#> 
#> $majRev_time
#> [1] NA
#> 
#> $majRev_factor
#> [1] NA
#> 
#> $majRev_atP
#> [1] NA

Note that SPLICE by default assumes the (removable) dependence of frequency of major revisions on claim size, hence there is no need to supply any additional arguments to claim_majRev_freq(), unlike in Example 1.1.2.

If one would like to keep the structure of the default sampling function but modify the benchmark value, they may do so via e.g.

major_test <- claim_majRev_freq(
  claims = test_claims,
  claim_size_benchmark = 30000
)

1.2 Time of Major Revisions

claim_majRev_time() generates the epochs of the major revisions (time measured from claim notification). It takes a very similar structure as claim_majRev_freq(), allowing users to input a sampling distribution via rfun and a parameter function which relates the parameter(s) of the distribution to selected claim characteristics.

Let \(\tau_k\) represent the epoch of the \(k\)th major revision (time measured from claim notification), \(k = 1, ..., K\). As the notification of a claim is considered a major revision itself, we have \(\tau_1 = 0\) for all claims.

Example 1.2.1 Modified uniform distribution

One simplistic option is to use a modified version of the uniform distribution (modified such that the first major revision always occurs at time 0 i.e. at claim notification).

majRev_time_paramfun in the example below specifies the min and max parameters for an individual claim as a function of setldel (settlement delay). Note that SPLICE by default assumes the (removable) dependence of timing of major revisions on claim size, settlement delay, and the partial payment times. Thanks to that, there is no need to supply any additional arguments to claim_majRev_time(). Users who wish to add further dependencies to the simulator can refer to Example 1.1.2.

## input
majRev_time_rfun <- function(n, min, max) {
  # n = number of major revisions of an individual claim
  majRev_time <- vector(length = n)
  majRev_time[1] <- 0 # first major revision at notification
  if (n > 1) {
    majRev_time[2:n] <- sort(stats::runif(n - 1, min, max))
  }
  
  return(majRev_time)
}
majRev_time_paramfun <- function(setldel, ...) {
  # setldel = settlement delay
  c(min = setldel/3, max = setldel)
}

## implementation and output
major_test <- claim_majRev_time(
  test_claims, major, rfun = majRev_time_rfun, paramfun = majRev_time_paramfun
)
major_test[[1]][[1]]
#> $majRev_freq
#> [1] 3
#> 
#> $majRev_time
#> [1]  0.00000 13.01127 14.13173
#> 
#> $majRev_factor
#> [1] NA
#> 
#> $majRev_atP
#> [1] 0

Example 1.2.2 Default implementation

The default implementation takes into account much complexity from the real-life claim process. It assumes that with a positive probability, the last major revision for a claim may coincide with the second last partial payment (which is usually the major settlement payment). In such cases, majRev_atP would be set to 1 indicating that there is a major revision simultaneous with the penultimate payment.

The epochs of the remaining major revisions are sampled from triangular distributions with maximum density at the earlier part of the claim’s lifetime.

## package default function for time of major revisions
dflt.majRev_time_function <- function(
  # n = number of major revisions
  # setldel = settlement delay
  # penultimate_delay = time from claim notification to second last payment
  n, claim_size, setldel, penultimate_delay) {

  majRev_time <- rep(NA, times = n)
  
  # first revision at notification
  majRev_time[1] <- 0
  if (n > 1) {
    # if the claim has multiple major revisions
    # the probability of having the last revision exactly at the second last partial payment
    p <- 0.2 *
      min(1, max(0, (claim_size - ref_claim) / (14 * ref_claim)))
    at_second_last_pmt <- sample(c(0, 1), size = 1, replace = TRUE, prob = c(1-p, p))
    
    # does the last revision occur at the second last partial payment?
    if (at_second_last_pmt == 0) {
      # -> no revision at second last payment
      majRev_time[2:n] <- sort(rtri(n - 1, min = setldel/3, max = setldel, mode = setldel/3))
    } else {
      # -> yes, revision at second last payment
      majRev_time[n] <- penultimate_delay
      if (n > 2) {
        majRev_time[2:(n-1)] <- sort(
          rtri(n - 2, min = majRev_time[n]/3, max = majRev_time[n], mode = majRev_time[n]/3))
      }
    }
  }
  majRev_time
}

Note that rtri is a function to generate random numbers from a triangular distribution that is included as part of the SPLICE package.

claim_size and setldel are both directly accessible claim characteristics, but we need paramfun to take care of the computation of penultimate_delay as a function of the partial payment delays that we can access.

dflt.majRev_time_paramfun <- function(payment_delays, ...) {
  c(penultimate_delay = sum(payment_delays[1:length(payment_delays) - 1]),
    ...)
}

## implementation and output
major <- claim_majRev_time(
  claims = test_claims,
  majRev_list = major, # we will update the previous major list
  rfun = dflt.majRev_time_function,
  paramfun = dflt.majRev_time_paramfun
)

# view the major revision history of the first claim in the 1st occurrence period
# observe that we have now updated the time of major revisions
major[[1]][[1]]
#> $majRev_freq
#> [1] 3
#> 
#> $majRev_time
#> [1]  0.000000  7.727115 14.203130
#> 
#> $majRev_factor
#> [1] NA
#> 
#> $majRev_atP
#> [1] 0

The above sampling distribution has been included as the default. There is no need to reproduce the above code if the user is happy with this default distribution. A simple equivalent to the above code is just

major <- claim_majRev_time(claims = test_claims, majRev_list = major)

This example is here only to demonstrate how the default function operates.

1.3 Size of Major Revisions

claim_majRev_size() generates the sizes of the major revisions. The major revision multipliers apply to the incurred loss estimates, that is, a revision multiplier of 2.54 means that at the time of the major revision the incurred loss increases by a factor of 2.54. We highlight this as in the case of minor revisions, the multipliers will instead apply to outstanding claim amounts, see claim_minRev_size().

The reason for this differentiation is that major revisions represent a total change of perspective on ultimate incurred cost, whereas minor revisions respond more to matters of detail, causing the case estimator to apply a revision factor to the estimate of outstanding payments.

Example 1.3.1 Gamma distribution

Suppose that we believe the major revision multipliers follow a gamma distribution with parameters dependent on the size of the claim. Then we can set up the simulation in the following way:

## input
majRev_size_rfun <- function(n, shape, rate) {
  # n = number of major revisions of an individual claim
  majRev_size <- vector(length = n)
  majRev_size[1] <- 1 # first major revision at notification
  if (n > 1) {
    majRev_size[2:n] <- stats::rgamma(n - 1, shape, rate)
  }
  
  majRev_size
}

majRev_size_paramfun <- function(claim_size) {
  shape <- max(log(claim_size / 5000), 1)
  rate <- 10 / shape
  c(shape = shape, rate = rate)
}

The default implementation of claim_majRev_size() assumes no further dependencies on claim characteristics. Hence we need to supply claim_size as an additional argument when running claim_majRev_size() when the above set up.

## implementation and output
claim_size_vect <- unlist(test_claims$claim_size_list)
major_test <- claim_majRev_size(
  majRev_list = major,
  rfun = majRev_size_rfun,
  paramfun = majRev_size_paramfun,
  claim_size = claim_size_vect
)

# view the major revision history of the first claim in the 1st occurrence period
# observe that we have now updated the size of major revisions
major_test[[1]][[1]]
#> $majRev_freq
#> [1] 3
#> 
#> $majRev_time
#> [1]  0.000000  7.727115 14.203130
#> 
#> $majRev_factor
#> [1] 1.0000000 0.9597788 2.0640283
#> 
#> $majRev_atP
#> [1] 0

Example 1.3.2 Default implementation

The default implementation samples the major revision multipliers from lognormal distributions:

## input
# package default function for sizes of major revisions
dflt.majRev_size_function <- function(n) {
  majRev_factor <- rep(NA, times = n)
  # set revision size = 1 for first revision (i.e. the one at notification)
  majRev_factor[1] <- 1
  if (n > 1) {
    # if the claim has multiple major revisions
    majRev_factor[2] <- stats::rlnorm(n = 1, meanlog = 1.8, sdlog = 0.2)
    if (n > 2) {
      # the last revision factor depends on what happened at the second major revision
      mu <- 1 + 0.07 * (6 - majRev_factor[2])
      majRev_factor[3] <- stats::rlnorm(n = 1, meanlog = mu, sdlog = 0.1)
    }
  }

  majRev_factor
}

## implementation and output
major <- claim_majRev_size(
  majRev_list = major,
  rfun = dflt.majRev_size_function
)

# view the major revision history of the first claim in the 1st occurrence period
# observe that we have now updated the size of major revisions
major[[1]][[1]]
#> $majRev_freq
#> [1] 3
#> 
#> $majRev_time
#> [1]  0.000000  7.727115 14.203130
#> 
#> $majRev_factor
#> [1] 1.000000 5.165349 2.619366
#> 
#> $majRev_atP
#> [1] 0

For this particular claim record, we observe 3 major revisions:

First one at claim notification with revision size \(g_1 =\) 1 (note that the notification of a claim is considered as a major revision, so all claims have at least 1 major revision);
Second one at delay \(\tau_2 =\) 7.7271155 from notification and has revision size of \(g_2 =\) 5.1653486 on the incurred loss;
Third one at delay \(\tau_3 =\) 14.2031298 from notification and has revision size of \(g_3 =\) 2.6193658 on the incurred loss. As commented in the paper, a claim may experience up to two major revisions in addition to the initial one, but the second, if it occurs at all, is likely to be smaller than the first.

2. Minor Revisions

Compared to the major revisions, the simulation of minor revisions may require slightly more complicated input specification, as we need to separate the case of minor revisions that occur simultaneously with a partial payment (minRev_atP) and the ones that do not.

Similar to the case of major revisions, the suite of functions under this heading run in sequential order to simulate the (1) frequency, (2) time, and (3) size of minor revisions of outstanding claim payments, for each of the claims occurring in each of the occurrence periods. In particular, claim_minRev_freq() sets up the structure of the minor revisions: a nested list such that the jth component of the ith sub-list is a list of information on minor revisions of the jth claim of occurrence period i. The “unit list” contains the following components:

Name	Description
`minRev_atP`	A logical vector indicating whether there is a minor revision at each partial payment; see `claim_minRev_freq()`
`minRev_freq_atP` (`minRev_freq_notatP`)	Number of minor revisions that occur (or do not occur) simultaneously with a partial payment. `minRev_freq_atP` is numerically equal to the sum of `minRev_atP`
`minRev_time_atP`, (`minRev_time_notatP`)	Time of minor revisions that occur (or do not occur) simultaneously with a partial payment (time measured from claim notification); see `claim_minRev_time()`
`minRev_factor_atP`, (`minRev_factor_notatP`)	Minor revision multiplier of outstanding claim payments for revisions at partial payments and at any other times, respectively; see `claim_minRev_size()`

2.1 Frequency of Minor Revisions

Minor revisions may occur simultaneously with a partial payment, or at any other time:

For the former case, we sample the occurrence of minor revisions as Bernoulli random variables with a probability parameter prob_atP (defaults to 1/2);
For the latter case, users have the option to specify an rfun_notatP for simulation and a paramfun_notatP to input the parameters for the sampling distribution, much in the same way as the analogous case of major revisions. The default implementation assumes a geometric distribution with mean = min(3, setldel / 4) and is illustrated below.

Example 2.1.1 Default implementation

## input
# package default function for frequency of minor revisions NOT at partial payments
dflt.minRev_freq_notatP_function <- function(n, setldel) {
  # setldel = settlement delay
  minRev_freq_notatP <- stats::rgeom(n, prob = 1 / (min(3, setldel/4) + 1))
  minRev_freq_notatP
}

## implementation and output
minor <- claim_minRev_freq(
  test_claims,
  prob_atP = 0.5,
  rfun_notatP = dflt.minRev_freq_notatP_function)

# view the minor revision history of the 10th claim in the 1st occurrence period
minor[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#> 
#> $minRev_freq_atP
#> [1] 3
#> 
#> $minRev_freq_notatP
#> [1] 1
#> 
#> $minRev_time_atP
#> [1] NA
#> 
#> $minRev_time_notatP
#> [1] NA
#> 
#> $minRev_factor_atP
#> [1] NA
#> 
#> $minRev_factor_notatP
#> [1] NA

An equivalent way of setting up the same structure using paramfun:

minRev_freq_notatP_paramfun <- function(setldel) {
  c(prob = 1 / (min(3, setldel/4) + 1))
}

minor <- claim_minRev_freq(
  test_claims,
  prob_atP = 0.5,
  rfun_notatP = stats::rgeom,
  paramfun_notatP = minRev_freq_notatP_paramfun)

Again the above example is only for illustrative purposes and users can run the default without explicitly spelling out the sampling distributions as above:

minor <- claim_minRev_freq(claims = test_claims)

Example 2.1.2 Alternative sampling distribution

Suppose we believe that there should be no minor revisions at partial payments (prob_atP = 0) and that the number of minor revisions should follow a geometric distribution but with a higher mean. SPLICE can easily account for these assumptions through the following code.

minRev_freq_notatP_paramfun <- function(setldel) {
  c(prob = 1 / (min(3, setldel/4) + 2))
}

minor_test <- claim_minRev_freq(
  test_claims,
  prob_atP = 0,
  rfun_notatP = stats::rgeom,
  paramfun_notatP = minRev_freq_notatP_paramfun)

minor_test[[1]][[10]]
#> $minRev_atP
#> [1] 0 0 0 0 0
#> 
#> $minRev_freq_atP
#> [1] 0
#> 
#> $minRev_freq_notatP
#> [1] 0
#> 
#> $minRev_time_atP
#> [1] NA
#> 
#> $minRev_time_notatP
#> [1] NA
#> 
#> $minRev_factor_atP
#> [1] NA
#> 
#> $minRev_factor_notatP
#> [1] NA

2.2 Time of Minor Revisions

claim_minRev_time() generates the epochs of the minor revisions (time measured from claim notification). Note that there is no need to specify a random sampling function for minor revisions that occur simultaneously with a partial payment because the revision times simply coincide with the epochs of the relevant partial payments.

For revisions outside of the partial payments, users are free to input a sampling distribution via rfun_notatP and a parameter function paramfun_notatP which relates the parameter(s) of the distribution to selected claim characteristics.

Example 2.2.1 Default implementation

By default we assume that the epochs of the minor revision can be sampled from a uniform distribution:

## input
# package default function for time of minor revisions that do not coincide with a payment
dflt.minRev_time_notatP <- function(n, setldel) {
  sort(stats::runif(n, min = setldel/6, max = setldel))
}

## implementation and output
minor <- claim_minRev_time(
  claims = test_claims,
  minRev_list = minor, # we will update the previous minor list
  rfun_notatP = dflt.minRev_time_notatP
)

# view the minor revision history of the 10th claim in the 1st occurrence period
# observe that we have now updated the time of minor revisions
minor[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#> 
#> $minRev_freq_atP
#> [1] 3
#> 
#> $minRev_freq_notatP
#> [1] 1
#> 
#> $minRev_time_atP
#> [1] 0.7911041 2.8190765 6.7787507
#> 
#> $minRev_time_notatP
#> [1] 5.835193
#> 
#> $minRev_factor_atP
#> [1] NA
#> 
#> $minRev_factor_notatP
#> [1] NA

Example 2.2.2 Alternative sampling distribution

Let’s consider an alternative example where we believe the epochs of minor revisions better follow a triangular distribution (see ?triangular from SPLICE). This can be set up as follows:

## input
minRev_time_notatP_rfun <- function(n, setldel) {
  # n = number of minor revisions
  # setldel = settlement delay
  sort(rtri(n, min = setldel/6, max = setldel, mode = setldel/6))
}

## implementation and output
minor_test <- claim_minRev_time(
  claims = test_claims,
  minRev_list = minor, # we will update the previous minor list
  rfun_notatP = minRev_time_notatP_rfun
)

# view the minor revision history of the 10th claim in the 1st occurrence period
# observe that we have now updated the time of minor revisions
minor_test[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#> 
#> $minRev_freq_atP
#> [1] 3
#> 
#> $minRev_freq_notatP
#> [1] 1
#> 
#> $minRev_time_atP
#> [1] 0.7911041 2.8190765 6.7787507
#> 
#> $minRev_time_notatP
#> [1] 3.702022
#> 
#> $minRev_factor_atP
#> [1] NA
#> 
#> $minRev_factor_notatP
#> [1] NA

2.3 Size of Minor Revisions

claim_minRev_size() generates the sizes of the minor revisions. Unlike the major revision multipliers which apply to the incurred loss estimates, the minor revision multipliers apply to the case estimate of outstanding claim payments i.e. a revision multiplier of 2.54 means that at the time of the minor revision the outstanding claims payment increases by a factor of 2.54. The reason for making this differentiation is briefly explained here.

SPLICE assumes a common sampling distribution for minor revisions that occur at partial payments and those that occur at any other times. But users may provide separate parameter functions (paramfun_atP and paramfun_notatP) for the two cases.

Example 2.3.1 Default implementation

In the default setting, we incorporate sampling dependence on the delay from notification to settlement, the delay from notification to the subject minor revisions, and the history of major revisions (in particular, the time of the second major revision).

Let \(\tau\) denote the delay from notification to the epoch of the minor revision, and \(w\) the settlement delay. Then

For \(\tau \le \frac{w}{3}\), the revision multiplier is sampled from a lognormal distribution with parameters meanlog = 0.15 and sdlog = 0.05 if preceded by a 2nd major revision, sdlog = 0.1 otherwise;
For \(\frac{w}{3} < \tau \le \frac{2w}{3}\), the revision multiplier is sampled from a lognormal distribution with parameters meanlog = 0 and sdlog = 0.05 if preceded by a 2nd major revision, sdlog = 0.1 otherwise;
For \(\tau > \frac{2w}{3}\), the revision multiplier is sampled from a lognormal distribution with parameters meanlog = -0.1 and sdlog = 0.05 if preceded by a 2nd major revision, sdlog = 0.1 otherwise.

Note that minor revisions tend to be upward in the early part of a claim’s life, and downward in the latter part.

## input
# package default function for the size of minor revisions
dflt.minRev_size <- function(
  # n = number of minor revisions
  # minRev_time = epochs of the minor revisions (from claim notification)
  # majRev_time_2nd = epoch of 2nd major revision (from claim notification)
  # setldel = settlement delay
  n, minRev_time, majRev_time_2nd, setldel) {

  k <- length(minRev_time)
  minRev_factor <- vector(length = k)

  if (k >= 1) {
    for (i in 1:k) {
      curr <- minRev_time[i]
      if (curr <= setldel/3) {
        meanlog <- 0.15
      } else if (curr <= (2/3) * setldel) {
        meanlog <- 0
      } else {
        meanlog <- -0.1
      }
      sdlog <- ifelse(curr > majRev_time_2nd, 0.05, 0.1)
      minRev_factor[i] <- stats::rlnorm(n = 1, meanlog, sdlog)
    }
  }

  minRev_factor
}

While setldel (settlement delay) is a directly accessible claim characteristic, we need paramfun to take care of the extraction and computation of majRev_time_2nd and minRev_time as a function of the revision lists that we can access.

# parameter function for minor revision at payments
minRev_size_param_atP <- function(major, minor, setldel) {
  list(minRev_time = minor$minRev_time_atP,
       majRev_time_2nd = ifelse(
         # so it always holds minRev_time < majRev_time_2nd
         is.na(major$majRev_time[2]), setldel + 1, major$majRev_time[2]),
       setldel = setldel)
}

# parameter function for minor revisions NOT at payments
minRev_size_param_notatP <- function(major, minor, setldel) {
  list(minRev_time = minor$minRev_time_notatP,
       majRev_time_2nd = ifelse(
         # so it always holds minRev_time < majRev_time_2nd
         is.na(major$majRev_time[2]), setldel + 1, major$majRev_time[2]),
       setldel = setldel)
}

## implementation and output
minor <- claim_minRev_size(
  claims = test_claims,
  majRev_list = major,
  minRev_list = minor,
  rfun = dflt.minRev_size,
  paramfun_atP = minRev_size_param_atP,
  paramfun_notatP = minRev_size_param_notatP
)

# view the minor revision history of the 10th claim in the 1st occurrence period
# observe that we have now updated the size of minor revisions
minor[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#> 
#> $minRev_freq_atP
#> [1] 3
#> 
#> $minRev_freq_notatP
#> [1] 1
#> 
#> $minRev_time_atP
#> [1] 0.7911041 2.8190765 6.7787507
#> 
#> $minRev_time_notatP
#> [1] 5.835193
#> 
#> $minRev_factor_atP
#> [1] 0.9613874 0.9873257 0.9366064
#> 
#> $minRev_factor_notatP
#> [1] 0.9107412

For this particular claim record, we observe 3 minor revisions that coincide with a payment and 1 minor revisions outside of the partial payment times.

Example 2.3.2 Uniform distribution for minor revision multipliers

For illustrative purposes, let’s now assume that the minor revision multipliers should be sampled from a uniform distribution.

## input
paramfun_atP <- function(claim_size, ...) {
  c(min = pmin(1, pmax(log(claim_size / 15000), 0.5)),
    max = pmin(1, pmax(log(claim_size / 15000), 0.5)) + 1)
}
paramfun_notatP <- paramfun_atP

## implementation and output
claim_size_vect <- unlist(test_claims$claim_size_list)
minor_test <- claim_minRev_size(
  test_claims, major, minor,
  rfun = stats::runif, paramfun_atP, paramfun_notatP,
  claim_size = claim_size_vect)
minor_test[[1]][[10]]
#> $minRev_atP
#> [1] 1 1 0 0 1
#> 
#> $minRev_freq_atP
#> [1] 3
#> 
#> $minRev_freq_notatP
#> [1] 1
#> 
#> $minRev_time_atP
#> [1] 0.7911041 2.8190765 6.7787507
#> 
#> $minRev_time_notatP
#> [1] 5.835193
#> 
#> $minRev_factor_atP
#> [1] 0.8357490 1.3070304 0.8162695
#> 
#> $minRev_factor_notatP
#> [1] 1.487637

3. Development of Case Estimates

This section requires no additional input specification from the program user (except the quarterly inflation rates - which should match with what was used in SynthETIC::claim_payment_inflation() when generating the inflated amount of partial payments) and simply consolidates the partial payments and the incurred revisions generated above, subject to some additional revision constraints (?claim_history for details). The end product is a full transactional history of the case estimates of the individual claims over its lifetime.

We can choose to exclude (default) or include adjustment for inflation:

Implementation and Output

# exclude inflation (by default)
result <- claim_history(test_claims, major, minor)
# include inflation
result_inflated <- claim_history(
  test_claims, major, minor, 
  base_inflation_vector = rep((1 + 0.02)^(1/4) - 1, times = 80))

Observe how the results differ between the case estimates with/without inflation:

data <- generate_incurred_dataset(test_claims_object, result)
str(data)
#> 'data.frame':    32384 obs. of  9 variables:
#>  $ claim_no  : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ claim_size: num  785871 785871 785871 785871 785871 ...
#>  $ txn_time  : num  0.689 3.949 4.198 6.327 7.096 ...
#>  $ txn_delay : num  0 3.26 3.51 5.64 6.41 ...
#>  $ txn_type  : chr  "Ma" "Mi" "P" "Mi" ...
#>  $ incurred  : num  71767 76242 76242 78597 78597 ...
#>  $ OCL       : num  71767 76242 51137 53493 27316 ...
#>  $ cumpaid   : num  0.00 6.55e-11 2.51e+04 2.51e+04 5.13e+04 ...
#>  $ multiplier: num  1 1.06 NA 1.05 NA ...
head(data, n = 9)
#>   claim_no claim_size   txn_time txn_delay txn_type  incurred       OCL
#> 1        1   785870.8  0.6889986  0.000000       Ma  71766.93  71766.93
#> 2        1   785870.8  3.9493565  3.260358       Mi  76241.74  76241.74
#> 3        1   785870.8  4.1975938  3.508595        P  76241.74  51136.97
#> 4        1   785870.8  6.3271693  5.638171       Mi  78597.46  53492.68
#> 5        1   785870.8  7.0960120  6.407013        P  78597.46  27316.06
#> 6        1   785870.8  8.4161141  7.727115       Ma 405983.27 354701.87
#> 7        1   785870.8 11.1576971 10.468698      PMi 422014.90 344400.32
#> 8        1   785870.8 14.4457620 13.756763      PMi 379360.08 275404.40
#> 9        1   785870.8 14.8921284 14.203130       Ma 993682.82 889727.14
#>        cumpaid multiplier
#> 1 0.000000e+00  1.0000000
#> 2 6.548362e-11  1.0623521
#> 3 2.510478e+04         NA
#> 4 2.510478e+04  1.0460667
#> 5 5.128140e+04         NA
#> 6 5.128140e+04  5.1653486
#> 7 7.761458e+04  1.0451975
#> 8 1.039557e+05  0.8761476
#> 9 1.039557e+05  2.6193658

data_inflated <- generate_incurred_dataset(test_claims_object, result_inflated)
str(data_inflated)
#> 'data.frame':    32384 obs. of  9 variables:
#>  $ claim_no  : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ claim_size: num  785871 785871 785871 785871 785871 ...
#>  $ txn_time  : num  0.689 3.949 4.198 6.327 7.096 ...
#>  $ txn_delay : num  0 3.26 3.51 5.64 6.41 ...
#>  $ txn_type  : chr  "Ma" "Mi" "P" "Mi" ...
#>  $ incurred  : num  71514 77210 77210 80528 80528 ...
#>  $ OCL       : num  71514 77210 51578 54896 27784 ...
#>  $ cumpaid   : num  0.00 2.18e-11 2.56e+04 2.56e+04 5.27e+04 ...
#>  $ multiplier: num  1 1.06 NA 1.05 NA ...
head(data_inflated, n = 9)
#>   claim_no claim_size   txn_time txn_delay txn_type   incurred       OCL
#> 1        1   785870.8  0.6889986  0.000000       Ma   71514.42  71514.42
#> 2        1   785870.8  3.9493565  3.260358       Mi   77209.73  77209.73
#> 3        1   785870.8  4.1975938  3.508595        P   77209.73  51577.79
#> 4        1   785870.8  6.3271693  5.638171       Mi   80528.15  54896.21
#> 5        1   785870.8  7.0960120  6.407013        P   80528.15  27783.67
#> 6        1   785870.8  8.4161141  7.727115       Ma  420279.94 367535.46
#> 7        1   785870.8 11.1576971 10.468698      PMi  442861.81 362288.63
#> 8        1   785870.8 14.4457620 13.756763      PMi  404523.03 295655.95
#> 9        1   785870.8 14.8921284 14.203130       Ma 1061937.89 953070.80
#>        cumpaid multiplier
#> 1 0.000000e+00  1.0000000
#> 2 2.182787e-11  1.0623521
#> 3 2.563194e+04         NA
#> 4 2.563194e+04  1.0460667
#> 5 5.274448e+04         NA
#> 6 5.274448e+04  5.1653486
#> 7 8.057318e+04  1.0451975
#> 8 1.088671e+05  0.8761476
#> 9 1.088671e+05  2.6193658

Note that the above data and data_inflated datasets are included as part of the package as test_incurred_dataset_noInf and test_incurred_dataset_inflated:

str(test_incurred_dataset_noInf)
#> 'data.frame':    31250 obs. of  9 variables:
#>  $ claim_no  : int  1 1 1 1 1 1 1 1 1 2 ...
#>  $ claim_size: num  785871 785871 785871 785871 785871 ...
#>  $ txn_time  : num  0.689 4.198 7.096 8.554 11.158 ...
#>  $ txn_delay : num  0 3.51 6.41 7.86 10.47 ...
#>  $ txn_type  : chr  "Ma" "P" "P" "Ma" ...
#>  $ incurred  : num  64033 64033 64033 339643 345196 ...
#>  $ OCL       : num  64033 38928 12752 288361 267581 ...
#>  $ cumpaid   : num  0 25105 51281 51281 77615 ...
#>  $ multiplier: num  1 NA NA 5.3 1.02 ...
str(test_incurred_dataset_inflated)
#> 'data.frame':    31250 obs. of  9 variables:
#>  $ claim_no  : int  1 1 1 1 1 1 1 1 1 2 ...
#>  $ claim_size: num  785871 785871 785871 785871 785871 ...
#>  $ txn_time  : num  0.689 4.198 7.096 8.554 11.158 ...
#>  $ txn_delay : num  0 3.51 6.41 7.86 10.47 ...
#>  $ txn_type  : chr  "Ma" "P" "P" "Ma" ...
#>  $ incurred  : num  63905 63905 63905 352421 362839 ...
#>  $ OCL       : num  63905 38273 11160 299677 282266 ...
#>  $ cumpaid   : num  0 25632 52744 52744 80573 ...
#>  $ multiplier: num  1 NA NA 5.3 1.02 ...

Output: Chain-Ladder Incurred Triangles

SPLICE also provides an option to produce the incurred triangles aggregated by accident and development periods:

square_inc <- output_incurred(result)
square_cum <- output_incurred(result, incremental = F)
square_inflated_inc <- output_incurred(result_inflated)
square_inflated_cum <- output_incurred(result_inflated, incremental = F)

yearly_inc <- output_incurred(result, aggregate_level = 4)
yearly_cum <- output_incurred(result, aggregate_level = 4, incremental = F)
yearly_cum
#>           DP1      DP2      DP3      DP4      DP5      DP6      DP7      DP8
#> AP1  21846312 37758611 49627375 55794329 58767892 62366936 62902546 62486029
#> AP2  17989801 37594432 48227626 50816309 55323643 55641507 55534306 57396402
#> AP3  18299898 38378501 46407079 53043975 55787454 57065817 57598917 57953386
#> AP4  17313124 34289765 44952334 51129516 53734088 54468306 55962256 56407511
#> AP5  20511783 42211689 54327913 60431132 62285147 61654702 62794043 63144648
#> AP6  21436730 39448518 47811984 53766037 57768597 58088457 58712026 58376682
#> AP7  20186643 35172249 44256497 48936876 54586154 55642298 54895552 55403589
#> AP8  13077553 29871334 42877233 45884245 47782393 49904558 49836704 49896893
#> AP9  19941842 41091858 50342713 56461572 56659111 55898789 56698061 57343630
#> AP10 19402526 40737328 52180283 57487056 56801187 55745586 55940701 56276123
#>           DP9     DP10
#> AP1  61739214 61995953
#> AP2  59477258 59229413
#> AP3  57966421 57805433
#> AP4  56206775 55752961
#> AP5  62825138 62572956
#> AP6  58348050 58322971
#> AP7  55341593 55297816
#> AP8  49803452 49725363
#> AP9  57979184 57712377
#> AP10 56238590 56093866

We can also set future = FALSE to hide the future triangle and perform a chain-ladder analysis:

# output the past cumulative triangle
cumtri <- output_incurred(result, aggregate_level = 4, 
                          incremental = FALSE, future = FALSE)
# calculate the age to age factors
selected <- vector()
J <- nrow(cumtri)
for (i in 1:(J - 1)) {
  # use volume weighted age to age factors
  selected[i] <- sum(cumtri[, (i + 1)], na.rm = TRUE) / sum(cumtri[1:(J - i), i], na.rm = TRUE)
}
# complete the triangle
CL_prediction <- cumtri
for (i in 2:J) {
  for (j in (J - i + 2):J) {
    CL_prediction[i, j] <- CL_prediction[i, j - 1] * selected[j - 1]
  }
}

CL_prediction
#>           DP1      DP2      DP3      DP4      DP5      DP6      DP7      DP8
#> AP1  21846312 37758611 49627375 55794329 58767892 62366936 62902546 62486029
#> AP2  17989801 37594432 48227626 50816309 55323643 55641507 55534306 57396402
#> AP3  18299898 38378501 46407079 53043975 55787454 57065817 57598917 57953386
#> AP4  17313124 34289765 44952334 51129516 53734088 54468306 55962256 56534496
#> AP5  20511783 42211689 54327913 60431132 62285147 61654702 62314234 62951426
#> AP6  21436730 39448518 47811984 53766037 57768597 58839322 59468737 60076832
#> AP7  20186643 35172249 44256497 48936876 51750611 52709794 53273640 53818388
#> AP8  13077553 29871334 42877233 47771337 50518056 51454394 52004811 52536584
#> AP9  19941842 41091858 52770452 58793790 62174269 63326652 64004069 64658540
#> AP10 19402526 38192007 49046443 54644714 57786634 58857693 59487304 60095590
#>           DP9     DP10
#> AP1  61739214 61995953
#> AP2  59477258 59724590
#> AP3  58598287 58841964
#> AP4  57163608 57401319
#> AP5  63651944 63916636
#> AP6  60745363 60997968
#> AP7  54417275 54643565
#> AP8  53121207 53342108
#> AP9  65378055 65649926
#> AP10 60764328 61017013