---
title: "Simulated Grouped Hyper Data Frame"
output: rmarkdown::html_vignette
author: Tingting Zhan
vignette: >
  %\VignetteIndexEntry{intro}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
library(knitr)
opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
options(rmarkdown.html_vignette.check_title = FALSE)
```

# Introduction

This vignette of package **`groupedHyperframe.random`** ([GitHub](https://github.com/tingtingzhan/groupedHyperframe.random), [RPubs](https://rpubs.com/tingtingzhan/groupedHyperframe_random)) documents the simulation of `superimpose`d `ppp.object`s and of `groupedHyperframe` objects.

## Note to Users

Examples in this vignette require that the `search` path contains

```{r setup}
library(groupedHyperframe.random)
```

## Terms and Abbreviations

```{r echo = FALSE, results = 'asis'}
c(
  '', 'Forward pipe operator', '`?base::pipeOp` introduced in `R` 4.1.0',
  '`CRAN`, `R`', 'The Comprehensive R Archive Network', 'https://cran.r-project.org',
  '`coords`', '$x$- and $y$-coordinates', '`spatstat.geom:::ppp`',
  '`diag`', 'Diagonal matrix', '`base::diag`',
  '`groupedHyperframe`', 'Grouped hyper data frame', '`groupedHyperframe::as.groupedHyperframe`',
  '`hypercolumns`, `hyperframe`', '(Hyper columns of) hyper data frame', '`spatstat.geom::hyperframe`',
  '`marks`, `marked`', '(Having) mark values', '`spatstat.geom::is.marked`',
  '`pmax`', 'Parallel maxima', '`base::pmax`',
  '`ppp`, `ppp.object`', 'Point pattern', '`spatstat.geom::ppp.object`',
  '`recycle`', 'Recycling', 'https://r4ds.had.co.nz/vectors.html#scalars-and-recycling-rules',
  '`rlnorm`', 'Log normal random variable', '`stats::rlnorm`',
  '`rMatClust`', 'Matern\'s cluster process', '`spatstat.random::rMatClust`',
  '`rmvnorm_`', 'Multivariate normal random variable', '`groupedHyperframe.random::rmvnorm_`; `MASS::mvrnorm`',
  '`rnbinom`', 'Negative binomial random variable', '`stats::rnbinom`',
  '`rpoispp`', 'Poisson point pattern', '`spatstat.random::rpoispp`',
  # '`rStrauss`', 'Strauss process', '`spatstat.random::rStrauss`',
  '`superimpose`', 'Superimpose', '`spatstat.geom::superimpose`',
  '`var`, `cor`, `cov`', 'Variance, correlation, covariance', '`stats::var`, `stats::cor`, `stats::cov`'
) |>
  matrix(nrow = 3L, dimnames = list(c('Term / Abbreviation', 'Description', 'Reference'), NULL)) |>
  t.default() |>
  as.data.frame.matrix() |>
  kable()
```

## Acknowledgement

This work is supported by NCI R01CA222847 ([I. Chervoneva](https://orcid.org/0000-0002-9104-4505), [T. Zhan](https://orcid.org/0000-0001-9971-4844), and [H. Rui](https://orcid.org/0000-0002-8778-261X)) and R01CA253977 (H. Rui and I. Chervoneva).

# Simulated Point Pattern

Function `.rppp()` simulates `superimpose`d `ppp.object`s with *vectorized* parameterization of the random point pattern and of the distribution of the `marks`.

## Simulated un`marked` Point Pattern

The example below simulates two `superimpose`d, `coords`-only (un`marked`) Matern's cluster processes with $(\kappa, \mu, s) = (10,8,.15)$ and $(5,4,.06)$.

```{r}
set.seed(125); r = .rppp(rMatClust(kappa = c(10, 5), mu = c(8, 4), scale = c(.15, .06)))
# plot(r) # suppressed for aesthetics
```

## Simulated `marked` Point Pattern

The example below simulates two `superimpose`d `marked` `ppp`s:

* Matern's cluster process $(\kappa,\mu,s) = (10,8,.15)$, attached with a log-normal `mark` $(\mu,\sigma)=(3,.4)$, and a negative-binomial `mark` $(r,p)=(4,.3)$.
* Matern's cluster process $(\kappa,\mu,s) = (5,4,.06)$, attached with a log-normal `mark` $(\mu,\sigma)=(5,.2)$, and a negative-binomial `mark` $(r,p)=(4,.3)$.

```{r}
set.seed(125); r1 = .rppp(
  rMatClust(kappa = c(10, 5), mu = c(8, 4), scale = c(.15, .06)),
  rlnorm(meanlog = c(3, 5), sdlog = c(.4, .2)),
  rnbinom(size = 4, prob = .3) # shorter parameter recycled
)
```

The example below simulates two `superimpose`d `marked` `ppp`s:

* Poisson point pattern $\lambda=3$, attached with a log-normal `mark` $(\mu,\sigma)=(3,.4)$, and a negative-binomial `mark` $(r,p)=(4,.3)$.
* Poisson point pattern $\lambda=6$, attached with a log-normal `mark` $(\mu,\sigma)=(5,.2)$, and a negative-binomial `mark` $(r,p)=(6,.1)$.

```{r}
set.seed(62); r2 = .rppp(
  rpoispp(lambda = c(3, 6)),
  rlnorm(meanlog = c(3, 5), sdlog = c(.4, .2)),
  rnbinom(size = c(4, 6), prob = c(.3, .1))
)
```

In the foreseeable future we will not support simulating more than one type of point pattern in a single call to function `.rppp()`. End users may manually `superimpose` different (`marked`) point patterns after simulating each of them separately.

```{r}
spatstat.geom::superimpose(r1, r2)
```

# Simulated `groupedHyperframe`

Now consider two `superimpose`d Matern's cluster processes, each attached with a log-normal `mark`. The population parameters are

```{r}
(p = data.frame(kappa = c(3,2), scale = c(.4,.2), mu = c(10,5), meanlog = c(3,5), sdlog = c(.4,.2)))
```

We simulate three subjects (e.g., patients). The subject-specific parameters deviate from the population parameters under a multivariate normal distribution with variance-covariance matrix $\Sigma$. The matrix $\Sigma$ may be specified by a `numeric` scalar, indicating all-equal `diag`onal `var`iances and zero `cor`relations/`cov`ariances. We also make sure that all subject-specific parameters satisfy $\kappa>1$, $\mu>1$, $s>0$ for Matern's cluster processes, and $\sigma>0$ for the log-normal distribution. Each `matrix` of the subject-specific parameters has the subjects on the rows, and the parameters of the `ppp`s to be `superimpose`d on the columns.

```{r}
set.seed(39); (p. = rmvnorm_(n = 3L, mu = p, Sigma = list(
  kappa = .2^2, scale = .05^2, mu = .5^2,
  meanlog = .1^2, sdlog = .01^2)) |>
  within.list(expr = {
    kappa = pmax(kappa, 1 + .Machine$double.eps)
    mu = pmax(mu, 1 + .Machine$double.eps)
    scale = pmax(scale, .Machine$double.eps)
    sdlog = pmax(sdlog, .Machine$double.eps)
  }))
```

We simulate one to four `ppp`s (e.g., medical images) per subject.

```{r}
set.seed(37); (n = sample.int(n = 4L, size = 3L, replace = TRUE))
```

Function `grouped_rppp()` simulates a `groupedHyperframe` with a `ppp`-hypercolumn, and one-or-more columns of the grouping structure.

```{r}
set.seed(76); (r = p. |>
  with.default(expr = {
    grouped_rppp(
      rMatClust(kappa = kappa, scale = scale, mu = mu),
      rlnorm(meanlog = meanlog, sdlog = sdlog),
      n = n
    )
  }))
```
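
For readers who want to see what a scalar $\Sigma$ amounts to, the subject-specific step for a single parameter ($\kappa$) can be sketched with `MASS::mvrnorm`, which the table of terms lists as a reference for `rmvnorm_()`. This is only an illustration under stated assumptions, not necessarily how `rmvnorm_()` is implemented: the names `kappa_pop`, `Sigma_kappa` and `kappa_subj` are hypothetical, and the random draws will generally differ from those of `rmvnorm_()` even under the same seed.

```r
library(MASS) # for mvrnorm(); assumed to be installed

# Population values of kappa for the two superimposed processes
kappa_pop = c(3, 2)

# A scalar variance .2^2 read as a diagonal matrix:
# equal variances on the diagonal, zero covariances elsewhere
Sigma_kappa = diag(x = .2^2, nrow = length(kappa_pop))

set.seed(39)
# 3 subjects on the rows, 2 superimposed processes on the columns
kappa_subj = mvrnorm(n = 3L, mu = kappa_pop, Sigma = Sigma_kappa)

# Enforce kappa > 1; pmax() keeps the dimensions of its first argument
kappa_subj = pmax(kappa_subj, 1 + .Machine$double.eps)
kappa_subj
```

The result is a 3-by-2 `matrix`, matching the layout described above: one row per subject and one column per `ppp` to be `superimpose`d.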