Help for package WpProj

Type:

Package

Title:

Linear p-Wasserstein Projections

Version:

0.2.3

Date:

2025-02-03

Description:

Performs Wasserstein projections from the predictive distributions of any model into the space of predictive distributions of linear models. We utilize L1 penalties to also reduce the complexity of the model space. This package employs the methods as described in Dunipace, Eric and Lorenzo Trippa (2020) <doi:10.48550/arXiv.2012.09999>.

License:

GPL (== 3.0)

Depends:

R (≥ 4.0)

Imports:

approxOT (≥ 1.2), glmnet, oem, Rcpp, rlang, ROI, ROI.plugin.ecos, ROI.plugin.lpsolve, Matrix, rqPen, quantreg, doParallel, foreach, doRNG, dplyr, stats, magrittr, methods, slam, lifecycle

LinkingTo:

approxOT (≥ 1.2), BH, Rcpp (≥ 1.0.0), RcppCGAL, RcppEigen, RcppProgress, RSpectra

Suggests:

ggplot2, ggsci, ggridges, testthat (≥ 2.1.0), transport, Rmosek, spelling, ECOSolveR

RoxygenNote:

7.3.2

URL:

https://github.com/ericdunipace/WpProj

BugReports:

https://github.com/ericdunipace/WpProj/issues

SystemRequirements:

C++17

Encoding:

UTF-8

Language:

en-US

NeedsCompilation:

yes

Packaged:

2025-02-03 23:43:13 UTC; eifer

Author:

Eric Dunipace

[aut, cre], Clemens Schmid

[ctb] (ETA progres bar is adapted from their code), Espen Bernton [ctb] ('Hilbert Sort' adapted from their code), Mathieu Gerber [ctb] ('Hilbert Sort' adapted from their code), Pierre Jacob [ctb] ('Hilbert Sort' adapted from their code), Bin Dai [ctb] (W2 projections adapted from their 'OEM' code), Jared Huling

[ctb] (W2 projections adapted from their 'OEM' code), Yixuan Qiu [ctb] (W2 projections adapted from their 'OEM' code), Dominic Schuhmacher [ctb] ('Shortsimplex 'optimal transport method adapted from their code), Nicolas Bonneel [ctb] ('Network Simplex' algorithm adapted from their code)

Maintainer:

Eric Dunipace <edunipace@mail.harvard.edu>

Repository:

CRAN

Date/Publication:

2025-02-05 18:20:02 UTC

WpProj: Linear p-Wasserstein Projections

Description

Author(s)

Maintainer: Eric Dunipace edunipace@mail.harvard.edu (ORCID)

Other contributors:

Clemens Schmid clemens@nevrome.de (ORCID) (ETA progres bar is adapted from their code) [contributor]
Espen Bernton ('Hilbert Sort' adapted from their code) [contributor]
Mathieu Gerber ('Hilbert Sort' adapted from their code) [contributor]
Pierre Jacob ('Hilbert Sort' adapted from their code) [contributor]
Bin Dai bdai@uwalumni.com (W2 projections adapted from their 'OEM' code) [contributor]
Jared Huling jaredhuling@gmail.com (ORCID) (W2 projections adapted from their 'OEM' code) [contributor]
Yixuan Qiu (W2 projections adapted from their 'OEM' code) [contributor]
Dominic Schuhmacher dominic.schuhmacher@mathematik.uni-goettingen.de ('Shortsimplex 'optimal transport method adapted from their code) [contributor]
Nicolas Bonneel ('Network Simplex' algorithm adapted from their code) [contributor]

Run the Hahn-Carvalho Method

Description

Runs the Hahn-Carvalho method but adapted to return full distributions.

Usage

HC(
  X,
  Y = NULL,
  theta,
  family = "gaussian",
  penalty = c("elastic.net", "selection.lasso", "lasso", "ols", "mcp", "scad", "mcp.net",
    "scad.net", "grp.lasso", "grp.lasso.net", "grp.mcp", "grp.scad", "grp.mcp.net",
    "grp.scad.net", "sparse.grp.lasso"),
  method = c("selection.variable", "projection"),
  lambda = numeric(0),
  nlambda = 100L,
  lambda.min.ratio = NULL,
  alpha = 1,
  gamma = 1,
  tau = 0.5,
  groups = numeric(0),
  penalty.factor = NULL,
  group.weights = NULL,
  maxit = 500L,
  tol = 1e-07,
  irls.maxit = 100L,
  irls.tol = 0.001
)

Arguments

X

Covariates

Y

Predictions

theta

Parameters

family

Family for method. See oem.

penalty

Penalty function. See oem.

method

Should we run a selection variable methodology or projection?

lambda

lambda for lasso. See oem for this and all options below

nlambda

Number of lambda values.

lambda.min.ratio

Minimum lambda ratio for self selected lambda

alpha

elastic net mixing.

gamma

tuning parameters for SCAD and MCP

tau

mixing parameter for sparse group lasso

groups

A vector of grouping values

penalty.factor

Penalty factor for OEM.

group.weights

Weights for groupped lasso

maxit

Max iteration for OEM

tol

Tolerance for OEM

irls.maxit

IRLS max iterations for OEM

irls.tol

IRLS tolerance for OEM

Value

a WpProj object with selected covariates and their values

References

Hahn, P. Richard and Carlos M. Carvalho. (2014) "Decoupling Shrinkage and Selection in Bayesian Linear Models: A Posterior Summary Perspective." https://arxiv.org/pdf/1408.0464

Examples

n <- 32
p <- 10
s <- 99
x <- matrix( 1, nrow = n, ncol = p )
beta <- (1:10)/10
y <- x %*% beta
post_beta <- matrix(beta, nrow=p, ncol=s) 
post_mu <- x %*% post_beta

fit <-  HC(X=x, Y=post_mu, theta = post_beta,
               penalty = "lasso", 
               method = "projection"
)

Options For Use With the L0 Method

Description

Options For Use With the L0 Method

Usage

L0_method_options(
  method = c("binary program", "projection"),
  transport.method = transport_options(),
  epsilon = 0.05,
  OTmaxit = 0,
  parallel = NULL,
  ...
)

Arguments

method

Should covariates be selected as an approximate "binary program" or should a projection method be used. Default is the approximate binary program.

transport.method

Method for Wasserstein distance calculation. Should be one the outputs of transport_options().

epsilon

A value > 0 for the penalty parameter if using the Sinkhorn method for optimal transport

OTmaxit

The number of iterations to run the Wasserstein distance solvers.

parallel

A cluster backend to be used by foreach::foreach() if parallelization is desired.

...

Not used

Value

a named list corresponding to the above arguments

Examples

L0_method_options()

Options For Use With the L1 Method

Description

Options For Use With the L1 Method

Usage

L1_method_options(
  penalty = L1_penalty_options(),
  lambda = numeric(0),
  nlambda = 500L,
  lambda.min.ratio = 1e-04,
  gamma = 1,
  alpha = 1,
  maxit = 500L,
  model.size = NULL,
  tol = 1e-07,
  display.progress = FALSE,
  solver.options = NULL
)

Arguments

penalty

The penalty to use. See L1_penalty_options() for more details.

lambda

The penalty parameter to use if method is "L1".

nlambda

The number of lambdas to explore for the "L1" method if lambda is not provided

lambda.min.ratio

The minimum ratio of max to min lambda for "L1" method. Default 1e-4.

gamma

Tuning parameter for SCAD and MCP penalties if method = "L1". ⁠>=1⁠

alpha

Tuning parameter for elastic net penalties alpha should be in ⁠[0,1]⁠.

maxit

The maximum iterations for optimization. Default is 500.

model.size

What is the maximum number of coefficients to have in the final model. Default is NULL. If NULL, will find models from the minimum size, 0, to the number of columns in X.

tol

The tolerance for convergence

display.progress

Logical. Should intermediate progress be displayed? TRUE or FALSE. Default is FALSE.

solver.options

Options to be passed on to the solver. Only used for "ecos" and "mosek" solvers.

Value

A list with names corresponding to each argument above.

Examples

L1_method_options()

Recognized L1 Penalties

Description

Recognized L1 Penalties

Usage

L1_penalty_options()

Value

A character vector with the possible penalties for L1 methods

Examples

L1_penalty_options()
# [1] "lasso"            "ols"              "mcp"              "elastic.net"      "scad"            
# [6] "mcp.net"          "scad.net"         "grp.lasso"        "grp.lasso.net"    "grp.mcp"         
# [11] "grp.scad"         "grp.mcp.net"      "grp.scad.net"     "sparse.grp.lasso"

1-Wasserstein projection

Description

1-Wasserstein projection

Usage

W1L1(
  X,
  Y,
  theta = NULL,
  penalty = c("none", "lasso", "scad", "mcp"),
  model.size = NULL,
  lambda = numeric(0),
  lambda.min.ratio = 1e-04,
  nlambda = 10,
  gamma = 1,
  display.progress = FALSE,
  solver = c("cone", "rqPen", "gurobi", "mosek"),
  ...
)

Arguments

X

Covariates

Y

Predictions from arbitrary model

theta

Parameters of original linear model. Optional.

penalty

penalty term to use. One of "none", "lasso","scad","mcp"

model.size

Maximum number of coefficients in interpretable model

lambda

Lambdas to use

lambda.min.ratio

Minimum lambda to select if choosing lambdas using default methods

nlambda

number of lambdas to look through

gamma

parameter for SCAD and MCP methods

display.progress

Print intermediate output?

solver

Solver to use. Must be one of "rqPen", "gurobi", "mosek", though "mosek" is preferred.

...

options to pass to solvers

Value

WpProj object

2-Wasserstein distance selection by Integer Programming

Description

2-Wasserstein distance selection by Integer Programming

Usage

W2IP(
  X,
  Y = NULL,
  theta,
  transport.method = transport_options(),
  model.size = NULL,
  nvars = NULL,
  maxit = 100L,
  infimum.maxit = 100L,
  tol = 1e-07,
  solver = c("cone", "lp", "mosek", "cplex", "gurobi"),
  display.progress = FALSE,
  parallel = NULL,
  ...
)

Arguments

X

Covariates

Y

Predictions from arbitrary model

theta

Parameters of original linear model. Required

transport.method

Method for Wasserstein distance calculation. Should be one of the outputs of transport_options().

model.size

Maximum number of coefficients in interpretable model

nvars

The number of variables to explore. Should be an integer vector of model sizes. Default is NULL which will explore all models from 1 to model.size.

maxit

Maximum number of solver iterations

infimum.maxit

Maximum iterations to alternate binary program and Wasserstein distance calculation

tol

Tolerance for convergence of coefficients

solver

The solver to use. Must be one of "cone","lp", "cplex", "gurobi","mosek".

display.progress

Should progress be printed?

parallel

foreach back end. See foreach::foreach() for more details.

...

Extra args to Wasserstein distance methods

Details

For argument solution.method, options "cone" and "lp" use the free solvers "ECOS" and "lpSolver", respectively. "cplex", "gurobi" and "mosek" require installing the corresponding commercial solvers.

2-Wasserstein distance linear projections with an `L_1` penalty

Description

2-Wasserstein distance linear projections with an L_1 penalty

Usage

W2L1(
  X,
  Y = NULL,
  theta = NULL,
  penalty = c("lasso", "ols", "mcp", "elastic.net", "selection.lasso", "scad", "mcp.net",
    "scad.net", "grp.lasso", "grp.lasso.net", "grp.mcp", "grp.scad", "grp.mcp.net",
    "grp.scad.net", "sparse.grp.lasso"),
  method = c("projection", "selection.variable", "location.scale", "scale"),
  transport.method = transport_options(),
  epsilon = 0.05,
  OTmaxit = 0,
  model.size = NULL,
  lambda = numeric(0),
  nlambda = 100L,
  lambda.min.ratio = NULL,
  alpha = 1,
  gamma = 1,
  tau = 0.5,
  groups = numeric(0),
  scale.factor = numeric(0),
  penalty.factor = NULL,
  group.weights = NULL,
  maxit = 500L,
  tol = 1e-07,
  irls.maxit = 100L,
  irls.tol = 0.001,
  infimum.maxit = NULL,
  display.progress = FALSE
)

Arguments

X

An n x p matrix of covariates

Y

An n x s matrix of predictions

theta

optional parameter matrix for selection methods. Should be p x s.

penalty

Form of penalty. One of "lasso", "ols", "mcp", "elastic.net","selection.lasso", "scad", "mcp.net", "scad.net", "grp.lasso", "grp.lasso.net", "grp.mcp","grp.scad", "grp.mcp.net", "grp.scad.net", "sparse.grp.lasso"

method

"selection.variable" or "projection

transport.method

Method for calculating the Wasserstein distance. One of "exact", "sinkhorn", "greenkhorn","hilbert"

epsilon

Penalty parameter for Sinkhorn and Greenkhorn and optimal transport

OTmaxit

Maximum iterations for the optimal transport iterations

model.size

The maximum number of desired covariates. Defaults to the number of covariates.

lambda

Penalty parameter for lasso regression. See oem.

nlambda

Number of lambda values. See oem.

lambda.min.ratio

Minimum lambda ratio for self selected lambda. See oem.

alpha

elastic net mixing. See oem.

gamma

tuning parameters for SCAD and MCP. See oem.

tau

mixing parameter for sparse group lasso. See oem.

groups

A vector of grouping values. See oem.

scale.factor

Value to standardize the covariates by. Typically, is the standard deviation. Should have length 1 or length same as the number of covariates

penalty.factor

Penalty factor for OEM. See oem.

group.weights

Weights for group lasso. See oem.

maxit

Max iteration for OEM. See oem.

tol

Tolerance for OEM. See oem.

irls.maxit

IRLS max iterations for OEM. See oem.

irls.tol

IRLS tolerance for OEM. See oem.

infimum.maxit

Maximum number of iterations alternating optimization and Wasserstein distance calculation. Irrelevant for projection method.

display.progress

Display intermediate progress?

Value

Object of class WpProj

Infinity-Wasserstein Linear Projections With an L1 Penalty

Description

Infinity-Wasserstein Linear Projections With an L1 Penalty

Usage

WInfL1(
  X,
  Y,
  theta = NULL,
  penalty = c("none", "lasso", "mcp", "scad"),
  lambda = numeric(0),
  lambda.min.ratio = 1e-04,
  gamma = 1.5,
  nlambda = 10,
  solver = c("cone", "mosek", "gurobi"),
  options = list(solver_opts = NULL, init = NULL, tol = 1e-07, iter = 100),
  model.size = NULL,
  display.progress = FALSE,
  ...
)

Arguments

X

An n x p matrix of covariates

Y

An n x s matrix of predictions

theta

optional parameter matrix for selection methods. Should be p x s.

penalty

Form of penalty. One of "none", "lasso", "mcp","scad"

lambda

Penalty parameter for lasso regression.

lambda.min.ratio

Minimum lambda ratio for self selected lambda.

gamma

tuning parameters for SCAD and MCP.

nlambda

Number of lambda values.

solver

Which solver to use. One of "cone","mosek", or "gurobi". Note "mosek" and "gurobi" are commercial installers.

options

A list containing slots solver_opts, options for each solver, init, initial conditions fed into each solver, tol, tolerance for convergence, and iter, the maximum number of iterations

model.size

The maximum number of paramters to consider. Should be an integer greater than 1 and less than or equal to the number of covariates

display.progress

Whether to display progress. TRUE or FALSE

...

Additional arguments passed to the solver as needed

Value

A WpProj object

p-Wasserstein projections with an L0 penalty

Description

p-Wasserstein projections with an L0 penalty

Usage

WPL0(
  X,
  Y = NULL,
  theta,
  power = 2,
  method = c("selection.variable", "projection"),
  transport.method = transport_options(),
  epsilon = 0.05,
  OTmaxit = 0,
  parallel = NULL,
  ...
)

Arguments

X

matrix of covariates

Y

matrix of predictions

theta

optional matrix of coefficients from original model, if relevant

power

power of the Wasserstein distance

method

One of "selection.variable" or "projection". Methods decide whether covariate matrix in theta is preserved ("selection.variable") or if new projections are generated ("projection")

transport.method

Method for Wasserstein distance calculation. Should be one of the outputs of transport_options().

epsilon

hyperparameter for sinkhorn iterations

OTmaxit

max iteration for sinkhorn iterations

parallel

foreach backend

Value

WpProj object

p-Wasserstein Linear Projections With an `L_1` Penalty

Description

p-Wasserstein Linear Projections With an L_1 Penalty

Usage

WPL1(
  X,
  Y = NULL,
  theta = NULL,
  power = 2,
  penalty = c("lasso", "ols", "mcp", "elastic.net", "selection.lasso", "scad", "mcp.net",
    "scad.net", "grp.lasso", "grp.lasso.net", "grp.mcp", "grp.scad", "grp.mcp.net",
    "grp.scad.net", "sparse.grp.lasso"),
  model.size = NULL,
  lambda = numeric(0),
  nlambda = 100L,
  lambda.min.ratio = 1e-04,
  gamma = 1,
  alpha = 1,
  maxit = 500L,
  tol = 1e-07,
  ...
)

Arguments

X

matrix of covariates

Y

matrix of predictions

theta

optional parameter matrix for selection methods.

power

power of the Wasserstein distance

penalty

model.size

How many coefficients should final model have

lambda

penalty parameter

nlambda

number of lambdas to explore

lambda.min.ratio

minimum ratio of max to min lambda

gamma

Tuning parameter for SCAD and MCP methods

maxit

maximum iterations for optimization

tol

tolerance for convergence

...

arguments passed to other methods such as Wasserstein distance

Value

object of class WpProj

`W_p R^2` Function to Evaluate Performance

Description

This function will calculate p-Wasserstein distances between the predictions of interest and the projected model.

Usage

WPR2(
  predictions = NULL,
  projected_model,
  p = 2,
  method = "exact",
  base = NULL,
  ...
)

## S4 method for signature 'ANY,matrix'
WPR2(
  predictions = NULL,
  projected_model,
  p = 2,
  method = "exact",
  base = NULL,
  ...
)

## S4 method for signature 'ANY,distcompare'
WPR2(
  predictions = NULL,
  projected_model,
  p = 2,
  method = "exact",
  base = NULL,
  ...
)

## S4 method for signature 'ANY,list'
WPR2(
  predictions = NULL,
  projected_model,
  p = 2,
  method = "exact",
  base = NULL,
  ...
)

## S4 method for signature 'ANY,WpProj'
WPR2(
  predictions = NULL,
  projected_model,
  p = 2,
  method = "exact",
  base = NULL,
  ...
)

Arguments

predictions

Predictions of interest, likely from the original model

projected_model

A matrix of competing predictions, possibly from a WpProj fit, a WpProj fit itself, or a list of WpProj objects

p

Power of the Wasserstein distance to use in distance calculations

method

Method for calculating Wasserstein distance

base

The baseline result to compare to. If not provided, defaults to the model with no covariates and only an intercept.

...

Arguments passed to Wasserstein distance calculation. See wasserstein

Value

W_p R ^2 values

Examples

if (rlang::is_installed("stats")) {
# this example is not a true posterior estimation, but is used for illustration
n <- 32
p <- 10
s <- 21
x <- matrix( stats::rnorm(n*p), nrow = n, ncol = p )
beta <- (1:10)/10
y <- x %*% beta + stats::rnorm(n)
post_beta <- matrix(beta, nrow=p, ncol=s) + 
    matrix(rnorm(p*s), p, s) # not a true posterior
post_mu <- x %*% post_beta

fit <-  WpProj(X=x, eta=post_mu, power = 2.0)

out <- WPR2(predictions = post_mu, projected_model = fit, 
base = rowMeans(post_mu) # same as intercept only projection
)
}

p-Wasserstein distance projections using simulated annealing

Description

p-Wasserstein distance projections using simulated annealing

Usage

WPSA(
  X,
  Y = NULL,
  theta,
  power = 2,
  force = NULL,
  model.size = NULL,
  nvars = NULL,
  maxit = 1,
  temps = 1000,
  max.time = 3600,
  const = NULL,
  proposal = proposal.fun,
  options = list(method = c("selection.variable", "scale", "projection"),
    transport.method = transport_options(), energy.distribution = "boltzman",
    cooling.schedule = "Geman-Geman", proposal.method = "covariance", epsilon = 0.05,
    OTmaxit = 0),
  display.progress = FALSE,
  parallel = NULL,
  calc.theta = TRUE,
  xtx = NULL,
  xty = NULL,
  ...
)

Arguments

X

Covariate vector

Y

Predictions

theta

Optional matrix of parameters for generating predictions

power

Power of the Wasserstein distance

force

Any covariates to force into the model?

model.size

Maximum number of coefficients

nvars

The number of variables to explore. Should be an integer vector of model sizes. Default is NULL which will explore all models from 1 to model.size.

maxit

Maximum number of iterations

temps

Number of temperatures

max.time

Maximum time in seconds to run

const

Maximum value for simulated annealing distance

proposal

Proposal function. There is a default method but can provide your own with parameters xty, cur, idx, force, d, method

options

Options for simulated annealing

display.progress

Whether to display solver progress. TRUE or FALSE. Default is FALSE.

parallel

A foreach::foreach() backend

calc.theta

Should the model save the linear coefficients? TRUE or FALSE. Default is TRUE

xtx

precomputed crossproduct crossprod(X,X)

xty

precomputed crossprod(X, Y)

Value

An object of class WpProj

p-Wasserstein Distance Linear Projections Using a Stepwise Method

Description

p-Wasserstein Distance Linear Projections Using a Stepwise Method

Usage

WPSW(
  X,
  Y,
  theta,
  power = 2,
  force = NULL,
  direction = c("backward", "forward"),
  method = c("selection.variable", "scale", "projection"),
  transport.method = transport_options(),
  OTmaxit = 0,
  epsilon = 0.05,
  calc.theta = TRUE,
  model.size = NULL,
  parallel = NULL,
  display.progress = FALSE,
  ...
)

Arguments

X

matrix of covariates

Y

matrix of predictions

theta

optional parameter matrix for selection methods.

power

Power of the Wasserstein distance

force

Any covariates to force into the model?

direction

forward or backward selection

method

"selection.variable" or "projection

transport.method

Method for calculating the Wasserstein distance. Should be one of the outputs of transport_options().

OTmaxit

maximum number of iterations for the opt?imal transport methods

epsilon

hyperparameter if using sinkhorn iterations to approximate OT

calc.theta

should we get the linear coefficients

model.size

Maximum model size

parallel

foreach backend

display.progress

Display intermediate progress

Value

An object of class WpProj

p-Wasserstein Variable Importance

Description

This function will measure how much removing each covariate harms prediction accuracy.

Usage

WPVI(
  X,
  eta,
  theta,
  pred.fun = NULL,
  p = 2,
  ground_p = 2,
  transport.method = transport_options(),
  epsilon = 0.05,
  OTmaxit = 0,
  display.progress = FALSE,
  parallel = NULL
)

Arguments

X

Covariates

eta

Predictions from the estimated model

theta

Parameters from the estimated model.

pred.fun

A prediction function. must take variables x, theta as arguments: pred.fun(x,theta)

p

Power of Wasserstein distance

ground_p

Power of distance metric

transport.method

Transport methods. See transport_options() for more details.

epsilon

Hyperparameter for Sinkhorn iterations

OTmaxit

Maximum number of iterations for the Wasserstein method

display.progress

Display intermediate progress

parallel

a foreach backend if already created

Value

Returns an integer vector ranking covariate importance from most to least important.

Examples

n <- 128
p <- 10
s <- 99
x <- matrix(1, nrow = n, ncol = p )
beta <- (1:10)/10
y <- x %*% beta 
post_beta <- matrix(beta, nrow=p, ncol=s) 
post_mu <- x %*% post_beta

fit <-  WpProj(X=x, eta=post_mu, power = 2.0)
WPVI(X = x, eta = post_mu, theta = post_beta, transport.method = "hilbert")

p-Wasserstein Linear Projections

Description

This function will calculate linear projections from a set of predictions into the space of the covariates in terms of the p-Wasserstein distance.

Usage

WpProj(
  X,
  eta = NULL,
  theta = NULL,
  power = 2,
  method = c("L1", "binary program", "stepwise", "simulated annealing", "L0"),
  solver = c("lasso", "ecos", "lpsolve", "mosek"),
  options = NULL
)

Arguments

X

An n \times p matrix of covariates

eta

An n \times s matrix of predictions from a model

theta

An optional An p \times s parameter matrix for selection methods. Only makes sense if the original model is a linear model.

power

The power of the Wasserstein distance to use. Must be ⁠>= 1.0⁠. Will default to 2.0.

method

The algorithm to calculate the Wasserstein projections. One of "L1", "binary program", "IP", "stepwise","simulated annealing", or "L0". Will default to "L1" if not provided. See details for more information.

solver

Which solver to use? One of "lasso", "ecos", "lpsolve", or "mosek". See details for more information

options

Options passed to the particular method and desired solver. See details for more information.

Details

Methods

The WpProj function is a wrapper for the various Wasserstein projection methods. It is designed to be a one-stop shop for all Wasserstein projection methods. It will automatically choose the correct method and solver based on the arguments provided. It will also return a standardized output for all methods. Each method has its own set of options that can be passed to it. See the documentation for each method for more information.

For the L1 methods, see L1_method_options() for more information. For the binary program methods, see binary_program_method_options() for more information. For the stepwise methods, see stepwise_method_options() for more information. For the simulated annealing methods, see simulated_annealing_method_options() for more information.

In most cases, we recommend using the L1 methods or binary program methods. The L1 methods are the fastest and applicable to Wasserstein powers of any value greater than 1 and function as direct linear projections into the space of the covariates. The binary program methods instead preserve the coefficients of the original model if this is of interest, such as when the original model was already a linear model. The binary program will instead function as a way of turning on and off certain coefficients in a way that minimizes the Wasserstein distance between reduced and original models. Of note, we also have available an approximate binary program method using a lasso solver. This method is faster than the exact binary program method but is not guaranteed to find the optimal solution. It is recommended to use the exact binary program method if possible. See binary_program_method_options() for more information on how to set up the approximate method as some arguments for the lasso solver should be specified. For more information on how this works, please also see the referenced paper.

The stepwise, simulated annealing, and L0 methods also select covariates like the binary program methods but they can be slower. They are presented merely for comparison purposes given they were used in the original paper.

Wasserstein distances and powers

The Wasserstein distance is a measure of distance between two probability distributions. It is defined as:

W_p(\mu,\nu) = \left(\inf_{\pi \in \Pi(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x-y\|^p d\pi(x,y)\right)^{1/p},

where \Pi(\mu,\nu) is the set of all joint distributions with marginals \mu and \nu. The Wasserstein distance is a generalization of the Euclidean distance, which is the case when p=2. In our function we have argument power that corresponds to the p of the equation above. The default power is 2.0 but any value greater than or equal to 1.0 is allowed. For more information, see the references.

The particular implementation of the Wasserstein distance is as follows. If \mu is the original prediction from the original model, then we seek to find a new prediction \nu that minimizes the Wasserstein distance between the two: \text{argmin}_\nu W_p(\mu,\nu).

Value

object of class WpProj, which is a list with the following slots:

call: The call to the function
theta: A list of the final parameter matrices for each returned model
fitted.values: A list of the fitted values for each returned model
power: The power of the Wasserstein distance used
method: The method used to calculate the Wasserstein projections
solver: The solver used to calculate the Wasserstein projections
niter: The number of iterations used to calculate the Wasserstein projections. Not all methods return a number of iterations so this may be NULL
nzero: The number of non zero coefficients in the final models

References

Dunipace, Eric and Lorenzo Trippa (2020) https://arxiv.org/abs/2012.09999.

Examples

if(rlang::is_installed("stats")) {
# note we don't generate believable data with real posteriors
# these examples are just to show how to use the function
n <- 32
p <- 10
s <- 21

# covariates and coefficients
x <- matrix( stats::rnorm( p * n ), nrow = n, ncol = p )
beta <- (1:10)/10

#outcome
y <- x %*% beta + stats::rnorm(n)

# fake posterior
post_beta <- matrix(beta, nrow=p, ncol=s) + stats::rnorm(p*s, 0, 0.1)
post_mu <- x %*% post_beta #posterior predictive distributions

# fit models
## L1 model
fit.p2     <-  WpProj(X=x, eta=post_mu, power = 2.0,
                   method = "L1", #default
                   solver = "lasso" #default
)

## approximate binary program
fit.p2.bp <-  WpProj(X=x, eta=post_mu, theta = post_beta, power = 2.0,
                   method = "binary program",
                   solver = "lasso" #default because approximate algorithm is faster
)

## compare performance by measuring distance from full model
dc <- distCompare(models = list("L1" = fit.p2, "BP" = fit.p2.bp))
if(rlang::is_installed(c("ggplot2","ggsci"))) {
plot(dc)
}

## compare performance by measuring the relative distance between a null model 
## and the predictions of interest as a pseudo R^2
r2.expect <- WPR2(predictions = post_mu, projected_model = dc) # can have negative values
r2.null  <- WPR2(projected_model = dc) # should be between 0 and 1
if(rlang::is_installed(c("ggplot2","ggsci"))) {
plot(r2.null)
}

## we can also examine how predictions change in the models for individual observations
if(rlang::is_installed(c("ggplot2","ggsci","ggridges"))) {
ridgePlot(fit.p2, index = 21, minCoef = 0, maxCoef = 10)
}
}

Options For Use With the Binary Program Method

Description

Options For Use With the Binary Program Method

Usage

binary_program_method_options(
  maxit = 500L,
  infimum.maxit = 100L,
  transport.method = transport_options(),
  epsilon = 0.05,
  OTmaxit = 100L,
  model.size = NULL,
  nvars = NULL,
  tol = 1e-07,
  display.progress = FALSE,
  parallel = NULL,
  solver.options = NULL
)

Arguments

maxit

The maximum iterations for the optimizer. Default is 500.

infimum.maxit

Maximum iterations to alternate binary program and Wasserstein distance calculations

transport.method

Method for Wasserstein distance calculation. Should be one the outputs of transport_options()

epsilon

A value > 0 for the penalty parameter of if using the Sinkhorn method

OTmaxit

The number of iterations to run the Wasserstein distance solvers.

model.size

What is the maximum number of coefficients to have in the final model. Default is NULL. If NULL, will find models from the minimum size, 0, to the number of columns in X.

nvars

The number of variables to explore. Should be an integer vector of model sizes. Default is NULL which will explore all models from 1 to model.size.

tol

The tolerance for convergence

display.progress

Logical. Should intermediate progress be displayed? TRUE or FALSE. Default is FALSE.

parallel

A cluster backend to be used by foreach::foreach(). See foreach::foreach() for details about how to set them up. The WpProj functions will register the cluster with the doParallel::registerDoParallel() function internally.

solver.options

Options to be passed on to the solver. See details

Details

This function will setup the default arguments used by the binary program method. Of note, for the argument solver.options, If using the "lasso" solver, you should provide arguments such as "penalty", "nlambda", "lambda.min.ratio", "gamma", and "lambda" in a list. A simple way to do this is to feed the output of the L1_method_options() function to the argument solver.options. This will tell the approximate solver, which uses a lasso method that then will project the parameters back to the \{0,1\} space. For the other solvers, you can see the options in the ECOS solver package, ECOSolveR::ecos.control(), and the options for the mosek solver, Rmosek::mosek().

Value

A list with names corresponding to each argument above.

Examples

binary_program_method_options()
# is using the lasso solver for the binary program method to give an approximate solution
binary_program_method_options(solver.options = L1_method_options(nlambda = 50L))

A Function to Combine `W_p R ^2` Objects

Description

Will combine W_p R ^2 objects into a single object.

Usage

combine.WPR2(...)

Arguments

...

List of W_p R^2 objects

Value

A vector of W_p R^2 objects

Examples

if (rlang::is_installed("stats")) {
n <- 128
p <- 10
s <- 99
x <- matrix( stats::rnorm( p * n ), nrow = n, ncol = p )
beta <- (1:10)/10
y <- x %*% beta + stats::rnorm(n)
post_beta <- matrix(beta, nrow=p, ncol=s) + stats::rnorm(p*s, 0, 0.1)
post_mu <- x %*% post_beta


fit1 <-  WpProj(X=x, eta=post_mu, theta = post_beta,
               power = 2.0, method = "binary program")
fit2 <-  WpProj(X=x, eta=post_mu, power = 2.0,
               options = list(penalty = "lasso")
)



out1 <- WPR2(predictions = post_mu, projected_model = fit1)
out2 <- WPR2(predictions = post_mu, projected_model = fit2)

combine <- combine.WPR2(out1, out2)
}

Combine distance calculations from the distCompare function

Description

Combine distance calculations from the distCompare function

Usage

combine.distcompare(...)

Arguments

...

distcompare objects that are the result of distCompare()

Value

an object of class combine.distcompare, the combined distcompare class objects as returned by distCompare() function

Compares Optimal Transport Distances Between WpProj and Original Models

Description

Will compare the Wasserstein distance between the original model and the WpProj model.

Usage

distCompare(
  models,
  target = list(parameters = NULL, predictions = NULL),
  power = 2,
  method = "exact",
  quantity = c("parameters", "predictions"),
  parallel = NULL,
  transform = function(x) {
     return(x)
 },
  ...
)

Arguments

models

A list of models from WpProj methods

target

The target to compare the methods to. Should be a list with slots "parameters" to compare the parameters and "predictions" to compare predictions

power

The power parameter of the Wasserstein distance.

method

Which approximation to the Wasserstein distance to use. Should be one of the outputs of transport_options().

quantity

Should the function target the "parameters" or the "predictions". Can choose both.

parallel

Parallel backend to use for the foreach package. See ⁠foreach::registerDoParallel(⁠ for more details.

transform

Transformation function for the predictions.

...

other options passed to the wasserstein() distance function

Details

For the data frames, dist is the Wasserstein distance, nactive is the number of active variables in the model, groups is the name distinguishing the model, and method is the method used to calculate the distance (i.e., exact, sinkhorn, etc.). If the list in models is named, these will be used as the group names otherwise the group names will be created based on the call from the WpProj method.

Value

an object of class distcompare with slots parameters, predictions, and p. The slots parameters and predictions are data frames. See the details for more info. The slot p is the power parameter of the Wasserstein distance used in the distance calculation.

Examples

if(rlang::is_installed("stats")) {
n <- 32
p <- 10
s <- 21
x <- matrix( stats::rnorm( p * n ), nrow = n, ncol = p )
beta <- (1:10)/10
y <- x %*% beta + stats::rnorm(n)
post_beta <- matrix(beta, nrow=p, ncol=s) + stats::rnorm(p*s, 0, 0.1)
post_mu <- x %*% post_beta

fit1 <-  WpProj(X=x, eta=post_mu, power = 2.0,
               options = list(penalty = "lasso")
)
fit2 <-  WpProj(X=x, eta=post_mu, theta = post_beta, power = 2.0,
               method = "binary program", solver = "lasso",
               options = list(solver.options = list(penalty = "mcp"))
)
dc <- distCompare(models = list("L1" = fit1, "BP" = fit2),
                 target = list(parameters = post_beta, predictions = post_mu))
if(rlang::is_installed(c("ggplot2","ggsci"))) {
plot(dc)
}
}

Plot Function for `W_p R^2` Objects

Description

Plot Function for W_p R^2 Objects

Usage

## S4 method for signature 'WPR2'
plot(
  x,
  xlim = NULL,
  ylim = NULL,
  linesize = 0.5,
  pointsize = 1.5,
  facet.group = NULL,
  ...
)

Arguments

x

A W_p R^2 object

xlim

x-axis limits

ylim

y-axis limits

linesize

Linesize for geom_line

pointsize

Point size for geom_point

facet.group

Group to do facet_grid by

...

Currently not used

Value

a ggplot2::ggplot() object

Examples

n <- 128
p <- 10
s <- 99
x <- matrix( stats::rnorm( p * n ), nrow = n, ncol = p )
beta <- (1:10)/10
y <- x %*% beta + stats::rnorm(n)
post_beta <- matrix(beta, nrow=p, ncol=s) + stats::rnorm(p*s, 0, 0.1)
post_mu <- x %*% post_beta

fit <-  WpProj(X=x, eta=post_mu, power = 2.0,
               options = list(penalty = "lasso")
)
obj <- WPR2(predictions = post_mu, projected_model = fit)
if (rlang::is_installed("ggplot2")) {
p <- plot(obj)
}

Plot 'combine.distcompare' Objects

Description

Plot 'combine.distcompare' Objects

Usage

## S4 method for signature 'combine_distcompare'
plot(x, ylim = NULL, ylabs = c(NULL, NULL), facet.group = NULL, ...)

Arguments

x

combine.distcompare objects resulting from the combine.distcompare() function

ylim

y-axis limits

ylabs

y-axis labels

facet.group

groups to facet by

...

additional plotting parameters like alpha

Value

An object of class plotcombinethat is a list of various diagnostic plots of the WpProj objects

Plot `distcompare` Objects

Description

Plot distcompare Objects

Usage

## S4 method for signature 'distcompare'
plot(
  x = NULL,
  models = NULL,
  ylim = NULL,
  ylabs = c(NULL, NULL),
  xlab = NULL,
  xlim = NULL,
  linesize = 0.5,
  pointsize = 1.5,
  facet.group = NULL,
  ...
)

Arguments

x

object of class distcompare

models

Can give list of WpProj outputs and have them turned into distcompare object for immediate plotting

ylim

Limits on y-axis

ylabs

Y-axis labels

xlab

X-axis labels

xlim

Limits of the x-axis

linesize

How big to make the lines?

pointsize

How big to make the points?

facet.group

Should the plots be turned into a facet_grid?

...

Additional options for the wasserstein distance if just inputing raw WpProj models

Value

A ggplot2 object

Plot the Rankings of the 'combine.distcompare' Objects

Description

Plot the Rankings of the 'combine.distcompare' Objects

Usage

plot_ranks(distances, ylim = NULL, ylabs = c(NULL, NULL), ...)

Arguments

distances

A combine.distcompare object resulting from the combine.distcompare() function

ylim

y-axis limits

ylabs

y-axis labels

...

additional plot arguments like alpha parameter in ggplot2

Value

object of class plotrank which is a list with slots "parameters" and "predictions"

Ranks `distcompare` Objects

Description

Ranks distcompare Objects

Usage

rank_distcompare(distances)

Arguments

distances

A distcompare object

Value

The ranks of a distcompare object as a list containing slots "predictions" and "parameters".

Objects exported from other packages

Description

These objects are imported from other packages. Follow the links below to see their documentation.

approxOT: transport_options, wasserstein

Ridge Plots for a Range of Coefficients

Description

This function will plot the distribution of predictions for a range of active coefficients

Usage

ridgePlot(
  fit,
  index = 1,
  minCoef = 1,
  maxCoef = 10,
  scale = 1,
  alpha = 0.5,
  full = NULL,
  transform = function(x) {
     x
 },
  xlab = "Predictions",
  bandwidth = NULL
)

Arguments

fit

A WpProj object or list of WpProj objects

index

The observation number to select. Can be a vector

minCoef

The minimum number of coefficients to use

maxCoef

The maximum number of coefficients to use

scale

How the densities should be scale

alpha

Alpha term from ggplot2 object

full

"True" prediction to compare to

transform

transform for predictions

xlab

x-axis label

bandwidth

Bandwidth for kernel

Value

a ggplot2::ggplot() plot

Examples

if(rlang::is_installed("stats")) {
n <- 128
p <- 10
s <- 99
x <- matrix(stats::rnorm(n*p), nrow = n, ncol = p )
beta <- (1:10)/10
y <- x %*% beta + stats::rnorm(n)
post_beta <- matrix(beta, nrow=p, ncol=s) + matrix(stats::rnorm(p*s, 0, 0.1), p, s)
post_mu <- x %*% post_beta
fit <-  WpProj(X=x, eta=post_mu, 
             power = 2
)
if(rlang::is_installed(c("ggplot2","ggsci","ggridges"))) {
ridgePlot(fit)
}
}

Options For Use With the Simulated Annealing Selection Method

Description

Options For Use With the Simulated Annealing Selection Method

Usage

simulated_annealing_method_options(
  force = NULL,
  method = c("binary program", "projection"),
  transport.method = transport_options(),
  OTmaxit = 0L,
  epsilon = 0.05,
  maxit = 1L,
  temps = 1000L,
  max.time = 3600,
  proposal.method = c("covariance", "uniform"),
  energy.distribution = c("boltzman", "bose-einstein"),
  cooling.schedule = c("Geman-Geman", "exponential"),
  model.size = NULL,
  nvars = NULL,
  display.progress = FALSE,
  parallel = NULL,
  calc.theta = TRUE,
  ...
)

Arguments

force

Any covariates to force into the model? Should be by column number or NULL if no variables to force into the model.

method

Should covariates be selected as an approximate "binary program" or should a projection method be used. Default is the approximate binary program.

transport.method

Method for Wasserstein distance calculation. Should be one the outputs of transport_options()

OTmaxit

The number of iterations to run the Wasserstein distance solvers.

epsilon

A value > 0 for the penalty parameter of if using the Sinkhorn method for optimal transport

maxit

Maximum number of iterations per temperature

temps

Number of temperatures to try

max.time

Maximum time in seconds to run the algorithm

proposal.method

The method to propose the next covariate to add. One of "covariance" or "random". "covariance" will randomly select from covariates with probability proportional to the absolute value of the covariance. "uniform" will select covariates uniformly at random.

energy.distribution

The energy distribution to use for evaluating proposals. One of "boltzman" or "bose-einstein". Default is "boltzman".

cooling.schedule

The schedule to use for cooling temperatures. One of "Geman-Geman" or "exponential". Default is "Geman-Geman".

model.size

How many coefficients should the maximum final model have? Ignored if nvars set.

nvars

What model sizes should one check? Should be a numeric vector with maximum less than number of variables or NULL. Default is NULL. Overrides model.size if is not NULL

display.progress

Logical. Should intermediate progress be displayed? TRUE or FALSE. Default is FALSE.

parallel

calc.theta

Return the linear coefficients? Default is TRUE.

...

Not used.

Value

A named list with the above arguments

Examples

simulated_annealing_method_options()

Options For Use With the Stepwise Selection Method

Description

Options For Use With the Stepwise Selection Method

Usage

stepwise_method_options(
  force = NULL,
  direction = c("backward", "forward"),
  method = c("binary program", "projection"),
  transport.method = transport_options(),
  OTmaxit = 0,
  epsilon = 0.05,
  model.size = NULL,
  display.progress = FALSE,
  parallel = NULL,
  calc.theta = TRUE,
  ...
)

Arguments

force

Any covariates to force into the model? Should be by column number or NULL if no variables to force into the model.

direction

"forward" or "backward" selection? Default is "backward"

method

Should covariates be selected as an approximate "binary program" or should a projection method be used. Default is the approximate binary program.

transport.method

Method for Wasserstein distance calculation. Should be one the outputs of transport_options()

OTmaxit

The number of iterations to run the Wasserstein distance solvers.

epsilon

A value > 0 for the penalty parameter of if using the Sinkhorn method for optimal transport

model.size

How many coefficients should the maximum final model have?

display.progress

Logical. Should intermediate progress be displayed? TRUE or FALSE. Default is FALSE.

parallel

calc.theta

Return the linear coefficients? Default is TRUE.

...

Not used

Value

A named list with the above arguments

Examples

stepwise_method_options()

WpProj: Linear p-Wasserstein Projections

Description

Author(s)

See Also

Run the Hahn-Carvalho Method

Description

Usage

Arguments

Value

References

Examples

Options For Use With the L0 Method

Description

Usage

Arguments

Value

Examples

Options For Use With the L1 Method

Description

Usage

Arguments

Value

See Also

Examples

Recognized L1 Penalties

Description

Usage

Value

Examples

1-Wasserstein projection

Description

Usage

Arguments

Value

2-Wasserstein distance selection by Integer Programming

Description

Usage

Arguments

Details

2-Wasserstein distance linear projections with an L_1 penalty

Description

Usage

Arguments

Value

Infinity-Wasserstein Linear Projections With an L1 Penalty

Description

Usage

Arguments

Value

p-Wasserstein projections with an L0 penalty

Description

Usage

Arguments

Value

p-Wasserstein Linear Projections With an L_1 Penalty

Description

Usage

Arguments

Value

See Also

W_p R^2 Function to Evaluate Performance

Description

Usage

Arguments

Value

Examples

p-Wasserstein distance projections using simulated annealing

Description

Usage

Arguments

Value

p-Wasserstein Distance Linear Projections Using a Stepwise Method

Description

Usage

Arguments

Value

p-Wasserstein Variable Importance

Description

Usage

Arguments

2-Wasserstein distance linear projections with an `L_1` penalty

p-Wasserstein Linear Projections With an `L_1` Penalty

`W_p R^2` Function to Evaluate Performance

A Function to Combine `W_p R ^2` Objects

Plot Function for `W_p R^2` Objects

Plot `distcompare` Objects

Ranks `distcompare` Objects