---
title: "Introduction"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Regression analysis using the individual participant data (IPD)

```{r}
library(metasplines)
library(splines2)

# Estimate a linear regression model using an individual participant data (IPD):
res <- lm(
  Petal.Width ~
    Species +
    nsk(Sepal.Length, Boundary.knots = c(4.5, 7.5), knots = c(5, 6, 6.5)),
  data=iris)
summary(res)

# How is the estimated correlation matrix like?
cor.m <- cov2cor(vcov(res))
colnames(cor.m) <- NULL
round(cor.m, 3)
```

Note that the selection of the left boundary knot (in this example 4.5) should be the same as the one used on the estimates obtained from literature. This is because the reference point (where the value of the spline is set to zero for identifiability) in splines is generally the left boundary knot. The spline parameter values correspond to the spline function values at the internal (5, 6 and 6.5) and right boundary (7.5) knots, when the `nsk` splines are used. Other spline families are not supported at the moment.

# Pool the estimated spline parameters based on the IPD with literature-based (LB) estimates

```{r}
# "Literature-based" (LB) estimates:
lb.df <- read.table(text=
"variable,     cov.value,  est,  est.var
Sepal.Length,  4.5,	       0,     0   
Sepal.Length,  5,	         0.15,  0.01 
Sepal.Length,  5.5,	       0.25,  0.01 
Sepal.Length,  6,	         0.4,   0.01 
Sepal.Length,  6.5,	       0.5,   0.01 
Sepal.Length,  8,          0.25,  0.04 
", sep=",", header=TRUE)

lb.df

# Output table with the point estimates and the estimated variances:
pool_splines(v="Sepal.Length", meta.df=lb.df, glm.res=res)

# Full output containing also spline estimates and covariance matrix:
pooled <- pool_splines(v="Sepal.Length", meta.df=lb.df, glm.res=res, full.output = TRUE)
str(pooled)
```

The default output of the `pool_splines` function is a list with one element:

* `pooled` is a data frame containing the point estimates and the estimated variances of the spline parameters of the internal and right boundary knot. The IPD, LB and pooled estimates are included, and also the p-value of the Q heterogeneity test.

With the `full.output = TRUE` option, more output is included:

* `log.hr` contains the spline function estimates with 95% confidence interval limits at a grid of covariate values, ranging between the boundary knots. Also here the values are provided for the IPD, LB and pooled estimates.
* `meta.df` contains the LB estimates provided in the `meta.df` argument.
* `meta.vcov` contains the estimated covariance matrix of the pooled spline parameters.

# Plot the IPD, LB and pooled spline estimates

The LB estimates given in the `lb.df` data frame above are presented as green dots, and the corresponding 95% confidence intervals as vertical dashed lines. 

The IPD, LB and pooled point estimates of the spline functions are the smooth curves, and the corresponding dashed lines are the 95% confidence intervals. Note that the LB spline estimates are close to those estimates reported in `lb.df`, but in the points which do not equal the knots of the spline, there are small differences. Note also, that as the left boundary knot is at 4.5, the values of the splines are zero, and the widths of the confidence intervals are zero. 

Here we can see that the pooled spline estimate is between the IPD estimate and the LB estimate, and the confidence interval is narrower than the confidence intervals based on the IPD or literature.

```{r, fig.width=7, fig.height=7}
plot(0, 0, type="n", xlim=range(pooled$log.hr$cov.value), ylim=c(-0.2, 0.7),
     xlab="Covariate value", ylab="")
col.v <- rainbow(3)
names(col.v) <- unique(pooled$log.hr$model)
for (i in unique(pooled$log.hr$model)) {
  with(subset(pooled$log.hr, model==i), {
    lines(cov.value, est, col=col.v[i], lwd=2)
    lines(cov.value, ci.low, col=col.v[i], lty=2)
    lines(cov.value, ci.upp, col=col.v[i], lty=2)
  })
}
with(lb.df, {
  points(cov.value, est, pch=15, col=col.v["lb"])
  for (j in 2:nrow(lb.df))
    lines(rep(cov.value[j], 2), est[j] + 1.96 * c(-1, 1) * sqrt(est.var[j]),
          col=col.v["lb"], lty=2)
})
legend("bottomleft", legend=names(col.v), col=col.v, lwd=c(2,2,2))
```