Similarity-driven multiview linear reconstruction model (simlr) for N modalities

simlr minimizes reconstruction error across related modalities. That is, simlr will reconstruct each modality matrix from a basis set derived from the other modalities. The basis set can be derived from SVD, ICA or a simple sum of basis representations. This function produces dataset-wide multivariate beta maps for each of the related matrices. The multivariate beta maps are regularized by user input matrices that encode relationships between variables. The idea is overall similar to canonical correlation analysis but generalizes the basis construction and to arbitrary numbers of modalities.

simlr(
  voxmats,
  smoothingMatrices,
  iterations = 10,
  sparsenessQuantiles,
  positivities,
  initialUMatrix,
  mixAlg = c("ica", "svd", "avg", "rrpca-l", "rrpca-s", "pca", "stochastic"),
  orthogonalize = FALSE,
  repeatedMeasures = NA,
  lineSearchRange = c(-10000000000, 10000000000),
  lineSearchTolerance = 0.00000001,
  randomSeed,
  constraint = c("none", "Grassmann", "Stiefel"),
  energyType = c("regression", "normalized", "cca", "ucca", "lowRank"),
  vmats,
  connectors = NULL,
  optimizationStyle = c("lineSearch", "mixed", "greedy"),
  scale = c("sqrtnp", "np", "centerAndScale", "center", "norm", "none", "impute",
    "eigenvalue", "robust"),
  expBeta = 0.5,
  verbose = FALSE
)

Arguments

voxmats	A list that contains the named matrices. Note: the optimization will likely perform much more smoothly if the input matrices are each scaled to zero mean unit variance e.g. by the `scale` function.
smoothingMatrices	list of (sparse) matrices that allow parameter smoothing/regularization. These should be square and same order and size of input matrices.
iterations	number of gradient descent iterations
sparsenessQuantiles	vector of quantiles to control sparseness - higher is sparser
positivities	vector that sets for each matrix if we restrict to positive or negative solution (beta) weights. choices are positive, negative or either as expressed as a string.
initialUMatrix	list of initialization matrix size `n` by `k` for each modality. Otherwise, pass a single scalar to control the number of basis functions in which case random initialization occurs. One may also pass a single initialization matrix to be used for all matrices. If this is set to a scalar, or is missing, a random matrix will be used.
mixAlg	'svd', 'ica', 'rrpca-l', 'rrpca-s', 'stochastic', 'pca' or 'avg' denotes the algorithm employed when estimating the mixed modality bases
orthogonalize	boolean to control whether we orthogonalize the solutions explicitly
repeatedMeasures	list of repeated measurement identifiers. this will allow estimates of per identifier intercept.
lineSearchRange	lower and upper limit used in `optimize`
lineSearchTolerance	tolerance used in `optimize`, will be multiplied by each matrix norm such that it scales appropriately with input data
randomSeed	controls repeatability of ica-based decomposition
constraint	one of none, Grassmann or Stiefel
energyType	one of regression, normalized, lowRank, cca or ucca
vmats	optional initial `v` matrix list
connectors	a list ( length of projections or number of modalities ) that indicates which modalities should be paired with current modality
optimizationStyle	one of `c("mixed","greedy","linesearch")`
scale	options to standardize each matrix. e.g. divide by the square root of its number of variables (Westerhuis, Kourti, and MacGregor 1998), divide by the number of variables or center or center and scale or ... (see code). can be a vector which will apply each strategy in order.
expBeta	if greater than zero, use exponential moving average on gradient.
verbose	boolean to control verbosity of output - set to level `2` in order to see more output, specifically the gradient descent parameters.

Value

A list of u, x, y, z etc related matrices.

Author

BB Avants.

Examples


set.seed(1500)
nsub = 25
npix = c(100,200,133)
nk = 5
outcome = matrix(rnorm( nsub * nk ),ncol=nk)
outcome1 = matrix(rnorm( nsub * nk ),ncol=nk)
outcome2 = matrix(rnorm( nsub * nk ),ncol=nk)
outcome3 = matrix(rnorm( nsub * nk ),ncol=nk)
view1tx = matrix( rnorm( npix[1]  * nk ), nrow=nk )
view2tx = matrix( rnorm( npix[2]  * nk ), nrow=nk )
view3tx = matrix( rnorm( npix[3]  * nk ), nrow=nk )
mat1 = (outcome %*% t(outcome1) %*% (outcome1)) %*% view1tx
mat2 = (outcome %*% t(outcome2) %*% (outcome2)) %*% view2tx
mat3 = (outcome %*% t(outcome3) %*% (outcome3)) %*% view3tx
matlist = list( vox = mat1, vox2 = mat2, vox3 = mat3 )
result = simlr( matlist )
#> initializing with random matrix:  3 columns
p1 = mat1 %*% (result$v[[1]])
p2 = mat2 %*% (result$v[[2]])
p3 = mat3 %*% (result$v[[3]])
regs = regularizeSimlr( matlist )
result2 = simlr( matlist )
#> initializing with random matrix:  3 columns
pred1 = predictSimlr( matlist, result )
pred2 = predictSimlr( matlist, result2 )

# compare to permuted data
s1 = sample( 1:nsub)
s2 = sample( 1:nsub)
resultp = simlr( list( vox = mat1, vox2 = mat2[s1,], vox3 = mat3[s2,] ) )
#> initializing with random matrix:  3 columns
p1p = mat1 %*% (resultp$v[[1]])
p2p = mat2[s1,] %*% (resultp$v[[2]])
p3p = mat3[s2,] %*% (resultp$v[[3]])

# compare to SVD
svd1 = svd( mat1, nu=nk, nv=0 )$u
svd2 = svd( mat2, nu=nk, nv=0 )$u
svd3 = svd( mat3, nu=nk, nv=0 )$u

# real
range(cor(p1,p2))
#> [1] -0.6589783  0.5133050
range(cor(p1,p3))
#> [1] -0.6737101  0.7049057
range(cor(p3,p2))
#> [1] -0.4720501  0.7041769

# permuted
range(cor(p1p,p2p))
#> [1] -0.1900885  0.2029698
range(cor(p1p,p3p))
#> [1] -0.2164204  0.2130339
range(cor(p3p,p2p))
#> [1] -0.4718212  0.4590747

# svd
print( range(cor( svd1,svd2) ))
#> [1] -0.6848291  0.6203092

resultp = simlr(list( vox = mat1, vox2 = mat2[s1,], vox3 = mat3[s2,] ),
   initialUMatrix = nk , verbose=TRUE, iterations=5,
   energyType = "normalized" )
#> initializing with random matrix:  5 columns
#> [1] "     <0> BUILD-V <0> BUILD-V <0> BUILD-V <0> BUILD-V <0>    "
#> [1] "initialDataTerm: 1.41238902910432  <o> mixer: ica  <o> E:  normalized"
#> [1] "Iteration: 1 bestEv: 1.38286025921981 bestIt: 1 CE: 1.38286025921981"
#> [1] "Iteration: 2 bestEv: 1.38286025921981 bestIt: 1 CE: 1.39833267218968"
#> [1] "Iteration: 3 bestEv: 1.38286025921981 bestIt: 1 CE: 1.44431740041708"
#> [1] "Iteration: 4 bestEv: 1.38286025921981 bestIt: 1 CE: 1.40084763843311"
#> [1] "Iteration: 5 bestEv: 1.38286025921981 bestIt: 1 CE: 1.45149701654717"

Similarity-driven multiview linear reconstruction model (simlr) for N modalities

Arguments

Value

See also

Author

Examples