
Implement reordering function to assess feature significance? #130

Open
JohannesWiesner opened this issue Jan 18, 2022 · 7 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@JohannesWiesner
Contributor

In their 2018 paper, Xia et al. implemented a method to match canonical variates from resampled data sets to those from the original data set, in order to compute p-values for the canonical weights.

Page 12 (Methods Section):

As permutation could induce arbitrary axis rotation, which changes the order of canonical variates, or axis reflection, which causes a sign change for the weights, we matched the canonical variates resulting from permuted data matrices to the ones derived from the original data matrix by comparing the clinical loadings (v) (75. Mišić, B. et al. Network-level structure-function relationships in human neocortex. Cereb. Cortex 26, 3285–96 (2016).)

Here's the code they implemented to achieve this:
https://github.com/cedricx/sCCA/blob/d5a2f4cb071bddd3f7d805e02ff27828b8494c66/sCCA/code/final/cca_functions.R#L191

Would it make sense to implement this method for cca-zoo? I am not even sure whether this is a 'good' method, having this issue in mind. But if I understand correctly, assessing the overall significance of the canonical variates themselves is one thing, and assessing the significance of the feature weights on those variates is another.

@JohannesWiesner
Contributor Author

Implementation in Python:

import numpy as np

# compute the correlation matrix between the columns of the original and resampled
# v weights (the absolute values are taken in reorderCCA below)
def cor_matrix(X, Y):
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    return np.dot(X.T, Y) / X.shape[0]

def reorderCCA(v_org, v_res, u_org, u_res, cors_org, cors_res):
    # u_* / v_* are the x/y weight matrices (features x components) of the original
    # ("org") and resampled ("res") CCA; cors_* are the canonical correlations.
    # Note: cors_org is currently unused.

    cor_matrix_abs = np.abs(cor_matrix(v_org, v_res))
    # for each original variate, find the resampled variate whose v weights have
    # the highest absolute correlation with it
    res_match = np.argmax(cor_matrix_abs, axis=1)
    res_cor = np.amax(cor_matrix_abs, axis=1)
    res_match_count = len(np.unique(res_match))
    
    # now reorder the u and v weights from the resampled CCA based on the correlation matches
    u_res_ordered = u_res[:,res_match]
    v_res_ordered = v_res[:,res_match]
    cors_final = cors_res[res_match]
    
    # for each v variate get the 'mean' sign for both original and resampled v
    signs_org = np.sign(np.mean(np.sign(v_org),axis=0))
    signs_res = np.sign(np.mean(np.sign(v_res_ordered),axis=0))
    
    # compute element wise product
    signs_prod = (signs_org * signs_res).reshape(-1,1)
    
    # change the signs of u and v of the resampled cca
    u_final = (u_res_ordered.T * signs_prod).T
    v_final = (v_res_ordered.T * signs_prod).T
    
    # if you can't exclusively assign one resampled v to one original v, return
    # nans
    if res_match_count < u_org.shape[1]:
        u_na = np.empty(u_org.shape)
        u_na[:] = np.nan
        v_na = np.empty(v_org.shape)
        v_na[:] = np.nan    
        cors_na = np.repeat(np.nan,u_org.shape[1])
        res_match_na = np.repeat(np.nan,u_org.shape[1])
        res_cor_na = np.repeat(np.nan,u_org.shape[1])
        res_one_reorder = [u_na,v_na,cors_na,res_match_na,res_cor_na]
    else: 
        res_one_reorder = [u_final,v_final,cors_final,res_match,res_cor]

    return res_one_reorder
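
For illustration, reorderCCA could be used in a permutation loop roughly like this (a sketch only, not part of Xia et al.'s code: scikit-learn's CCA, the random data, the fit_cca helper and the number of permutations are all placeholders standing in for whichever implementation and data one actually uses):

from sklearn.cross_decomposition import CCA

def fit_cca(X, Y, n_comp):
    # placeholder fit returning (u, v, canonical correlations)
    cca = CCA(n_components=n_comp).fit(X, Y)
    Xs, Ys = cca.transform(X, Y)
    cors = np.array([np.corrcoef(Xs[:, k], Ys[:, k])[0, 1] for k in range(n_comp)])
    return cca.x_weights_, cca.y_weights_, cors

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
Y = rng.normal(size=(100, 8))
n_comp = 3

u_org, v_org, cors_org = fit_cca(X, Y, n_comp)

n_perms = 500
null_v = np.full((n_perms,) + v_org.shape, np.nan)
for p in range(n_perms):
    Y_perm = Y[rng.permutation(len(Y))]             # permute rows of one view
    u_res, v_res, cors_res = fit_cca(X, Y_perm, n_comp)
    u_p, v_p, *_ = reorderCCA(v_org, v_res, u_org, u_res, cors_org, cors_res)
    null_v[p] = v_p                                  # stays NaN if matching was ambiguous

# p-value per weight, counting only permutations that could be matched
valid = ~np.isnan(null_v[:, 0, 0])
p_vals = (np.abs(null_v[valid]) >= np.abs(v_org)).mean(axis=0)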

@JohannesWiesner
Contributor Author

I am not sure if this function makes sense. Xia et al. refer to this paper (Mišić et al.):

Matching Decompositions From Original and Resampled Matrices

The decompositions of both permuted and bootstrap-resampled data matrices are not guaranteed to produce a set of LVs comparable with the ones derived from the original data matrix, because both types of resampling could induce arbitrary axis rotation (a change in the order of LVs) or axis reflection (a sign change for the weights). As a result, Procrustes rotations are used to match the resampled LVs to the original LVs (McIntosh and Lobaugh 2004).

But here, Procrustes rotations are used to match the resampled variates to the original ones. I am not sure whether the function above from Xia et al. is actually a Procrustes rotation; it rather matches each original variate to the resampled variate with the highest absolute loading correlation and then flips signs.
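
For comparison, a Procrustes-based alignment could look roughly like this (just a sketch using scipy.linalg.orthogonal_procrustes, not the pyls implementation): a full orthogonal rotation is estimated from the v weights and then applied to both u and v.

import numpy as np
from scipy.linalg import orthogonal_procrustes

def procrustes_align(v_org, v_res, u_res):
    # find the orthogonal matrix R minimising ||v_res @ R - v_org||_F
    R, _ = orthogonal_procrustes(v_res, v_org)
    # the rotation handles reordering and sign flips in one step, but it also
    # mixes the variates, unlike the pure matching-and-flipping above
    return u_res @ R, v_res @ R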

@jameschapman19 added the enhancement (New feature or request), help wanted (Extra attention is needed) and good first issue (Good for newcomers) labels on May 8, 2022
@JohannesWiesner
Contributor Author

I guess the whole issue could also be renamed to: "Implement Procrustes Rotation for resampling purposes (bootstrapping, permutation)".

For completeness:

Here's the implementation from pyls:
https://github.com/netneurolab/pypyls/blob/d8a19d564cc5804249527b68c937df3a5fd8c7cc/pyls/compute.py#L240

This package might help:
https://procrustes.readthedocs.io/en/latest/

Maybe also this:
https://github.com/MaxHalford/prince#generalized-procrustes-analysis-gpa

@JohannesWiesner
Contributor Author

Any update on this? I might start working on this with my colleagues :)

I also posted a question about this on CrossValidated.

@jameschapman19
Owner

Apologies for never having responded! It must have caught me at a busy time. I haven't done anything on this myself; as ever, I welcome any contributions that are of practical use to people :)

@jameschapman19
Owner

I've spent some time making sure everything works with scikit-learn's permutation testing machinery, so that might be relevant/helpful.

@JohannesWiesner
Contributor Author

In a nutshell: if we are only considering one variate, we can simply take the absolute values of the loadings/weights. But once you have more variates, you may end up in situations where the order of the variates differs but the weights are still similar (a similar relationship between the variables when each variate is considered in isolation). In this case, Procrustes rotation may help. I am not sure, however, what statisticians would say about this, because a different order of variates might in itself be an indicator of poor reliability.
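
To illustrate the single-variate case (a toy example, not from any of the papers above): with one variate there is no ordering ambiguity, only a possible sign flip, so the resampled weights can be sign-aligned via their correlation with the original weights.

import numpy as np

rng = np.random.default_rng(0)
w_org = rng.normal(size=50)                        # original weight vector
w_res = -w_org + rng.normal(scale=0.1, size=50)    # resampled: same pattern, flipped sign

sign = np.sign(w_org @ w_res)                      # +1 or -1
w_res_aligned = sign * w_res                       # now directly comparable to w_org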
