---
title: "TPLS_example2"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{TPLS_example2}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Hello, from Arthur

This script shows how to use T-PLS to assess cross-validation performance. To see how to use T-PLS to build a predictor, see TPLS_example1.

## Loading library and tutorial data

```{r setup}
library(TPLSr)
attach(TPLSdat)
```

`X` contains the single-trial betas. It has 3714 columns, each of which corresponds to a voxel.

`Y` is the binary variable to be predicted. In this case, `Y` is whether the participant chose the left or the right button. Ideally, the whole-brain predictor we create should highlight the left and right motor areas.

`subj` is a numerical variable that gives the subject number each observation belongs to. In this dataset, there are only 3 subjects.

`run` is a numerical variable that gives the scanner run each observation belongs to. In this dataset, each of the 3 subjects had 8 scan runs.

## Cross Validation

There are only 3 subjects in this dataset, so we will do 3-fold CV. This entails repeating the following steps 3 times, once per held-out subject:

1. Divide the data into training and testing sets. In this case, 2 subjects go into training and 1 subject into testing.
2. Using just the training data (i.e., 2 subjects), do secondary (nested) cross-validation to choose the best tuning parameters.
3. Based on the best tuning parameters, fit a whole-brain predictor using all training data (2 subjects).
4. Assess how well the left-out subject is predicted.
5. Repeat steps 1 to 4 for the next fold.

```{r}
ACCstorage <- rep(NA, 3)  # one accuracy value per primary fold

for (i in 1:3) {
  # primary cross-validation fold: hold out subject i
  test <- subj == i
  train <- !test

  # perform nested cross-validation within the training data
  cvmdl <- TPLS_cv(X[train, ], Y[train], subj[train])
  # evaluate 1 to 25 components and thresholds from 0 to 1 in steps of 0.05,
  # scored by Pearson correlation
  cvstats <- evalTuningParam(cvmdl, "Pearson", X[train, ], Y[train],
                             1:25, seq(0, 1, 0.05), run[train])

  # fit the T-PLS model on all training data; the best tuning parameters
  # are applied at prediction time
  mdl <- TPLS(X[train, ], Y[train])

  # predict the testing subject
  score <- TPLSpredict(mdl, cvstats$compval_best, cvstats$threshval_best, X[test, ])
  prediction <- 1 * (score > 0.5)

  # assess performance of prediction
  ACCstorage[i] <- mean(prediction == Y[test])
}

mean(ACCstorage)  # out-of-sample CV performance
```
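
The primary fold split used in the loop above can be checked on its own. The following is a minimal base-R sketch on made-up labels, not the tutorial data: `toy_subj` and `loso_folds` are hypothetical names introduced here for illustration and are not part of TPLSr. It only confirms that each subject is held out exactly once and that training and testing observations never overlap.

```r
# Hypothetical toy labels: 3 subjects with 4 observations each (not the tutorial data)
toy_subj <- rep(1:3, each = 4)

# Hypothetical helper: one pair of train/test logical masks per held-out subject
loso_folds <- function(subj) {
  lapply(sort(unique(subj)), function(s) {
    test <- subj == s
    list(heldout = s, test = test, train = !test)
  })
}

folds <- loso_folds(toy_subj)

for (f in folds) {
  # the masks must be disjoint and together cover every observation
  stopifnot(!any(f$train & f$test), all(f$train | f$test))
  cat("held-out subject", f$heldout, ":",
      sum(f$train), "training obs,", sum(f$test), "testing obs\n")
}
```

Building the masks up front, rather than inside the loop, makes it easy to verify the split before running the more expensive nested cross-validation.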