Correlation of two matrices of different sizes in Python


2023-05-09 07:28 | Source: compiled from the web

I have two matrices, p (500x10000) and h (500x256), and I need to compute the correlation between them in Python.

In Matlab, I use the corr() function without any problem: myCorrelation = corr(p, h);

In numpy, I tried np.corrcoef( p, h ):

      File "/usr/local/lib/python2.7/site-packages/numpy/core/shape_base.py", line 234, in vstack
        return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
    ValueError: all the input array dimensions except for the concatenation axis must match exactly
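A note on this error: by default, np.corrcoef treats each row as a variable and stacks the two inputs vertically, which fails here because the row lengths differ (10000 vs. 256). The sketch below assumes that the 500 rows are the shared observations, as with Matlab's corr(), and uses small random stand-in matrices rather than the real data. Passing rowvar=False makes the columns the variables; the block of the result that pairs columns of p with columns of h is then the matrix of interest.

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.standard_normal((50, 8))   # small stand-in for the 500x10000 matrix
h = rng.standard_normal((50, 3))   # small stand-in for the 500x256 matrix

# rowvar=False treats columns as variables, so both inputs share the same
# 50 observations and can be combined into one set of variables.
full = np.corrcoef(p, h, rowvar=False)      # shape (8 + 3, 8 + 3)

# The top-right block holds the correlations between columns of p and columns of h.
cross = full[:p.shape[1], p.shape[1]:]      # shape (8, 3)
```

At the sizes in the question this builds a full 10256x10256 matrix (roughly 0.8 GB in float64) only to keep a 10000x256 block, so it works better as a correctness check on small inputs than as the final approach.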

I also tried np.correlate( p, h ):

      File "/usr/local/lib/python2.7/site-packages/numpy/core/numeric.py", line 975, in correlate
        return multiarray.correlate2(a, v, mode)
    ValueError: object too deep for desired array

Input:

    pw.shape = (500, 10000)
    hW.shape = (500, 256)

First, I tried this:

myCorrelationMatrix, _ = scipy.stats.pearsonr( pw, hW )

Result:

        myCorrelationMatrix, _ = scipy.stats.pearsonr( pw, hW )
      File "/usr/local/lib/python2.7/site-packages/scipy/stats/stats.py", line 3019, in pearsonr
        r_num = np.add.reduce(xm * ym)
    ValueError: operands could not be broadcast together with shapes (500,10000) (500,256)

Then I tried this:

myCorrelationMatrix = corr2_coeff( pw, hW )

where corr2_coeff, according to [1], is:

    import numpy as np

    def corr2_coeff(A, B):
        # Row-wise mean of the input arrays, subtracted from the arrays themselves
        A_mA = A - A.mean(1)[:, None]
        B_mB = B - B.mean(1)[:, None]

        # Sum of squares across rows
        ssA = (A_mA**2).sum(1)
        ssB = (B_mB**2).sum(1)

        # Finally, get the correlation coefficients
        return np.dot(A_mA, B_mB.T) / np.sqrt(np.dot(ssA[:, None], ssB[None]))

The result is:

        myCorrelationMatrix, _ = corr2_coeff( powerTraces, hW )
      File "./myScript.py", line 175, in corr2_coeff
        return np.dot(A_mA,B_mB.T)/np.sqrt(np.dot(ssA[:,None],ssB[None]))
    ValueError: shapes (500,10000) and (256,500) not aligned: 10000 (dim 1) != 256 (dim 0)
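The shape mismatch suggests that corr2_coeff correlates rows with rows, so it expects both inputs to have the same number of columns. Here the two matrices share their rows (500 observations) instead. One way around this, sketched below, is to centre and normalise along axis 0, which should be equivalent to calling corr2_coeff(pw.T, hW.T). The function name corr_columns and the small random inputs are illustrative assumptions, not part of the original post.

```python
import numpy as np

def corr_columns(A, B):
    """Pearson correlation between every column of A and every column of B.

    A has shape (n, a) and B has shape (n, b); the result has shape (a, b).
    """
    A = np.asarray(A, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)

    # Subtract each column's mean (the observations run along axis 0).
    A_c = A - A.mean(axis=0)
    B_c = B - B.mean(axis=0)

    # Per-column sums of squared deviations.
    ssA = (A_c ** 2).sum(axis=0)
    ssB = (B_c ** 2).sum(axis=0)

    # Cross products of the centred columns, scaled by both norms.
    return (A_c.T @ B_c) / np.sqrt(np.outer(ssA, ssB))

rng = np.random.default_rng(1)
pw = rng.standard_normal((50, 8))   # stand-in for the (500, 10000) matrix
hW = rng.standard_normal((50, 3))   # stand-in for the (500, 256) matrix
corr = corr_columns(pw, hW)         # shape (8, 3)
```

For the shapes in the question, the result would be a (10000, 256) array, matching the layout of Matlab's corr(p, h).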

Finally, I tried this:

myCorrelationMatrix = corr_coeff( pw, hW )

where corr_coeff, according to [2], is:

    import numpy as np

    def corr_coeff(A, B):
        # Get number of rows in either A or B
        N = B.shape[0]

        # Store column-wise sums of A and B, as they are used in a few places
        sA = A.sum(0)
        sB = B.sum(0)

        # There are four parts in the formula; compute them one by one
        p1 = N * np.einsum('ij,ik->kj', A, B)
        p2 = sA * sB[:, None]
        p3 = N * ((B**2).sum(0)) - (sB**2)
        p4 = N * ((A**2).sum(0)) - (sA**2)

        # Finally, compute the Pearson correlation coefficient as a 2D array
        pcorr = (p1 - p2) / np.sqrt(p4 * p3[:, None])

        # Get the element corresponding to absolute argmax along the columns
        # out = pcorr[np.nanargmax(np.abs(pcorr), axis=0), np.arange(pcorr.shape[1])]
        return pcorr

The result is:

    RuntimeWarning: invalid value encountered in sqrt
      pcorr = ((p1 - p2)/np.sqrt(p4*p3[:,None]))
    RuntimeWarning: invalid value encountered in divide
      pcorr = ((p1 - p2)/np.sqrt(p4*p3[:,None]))
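These warnings point at the data rather than the shapes. The post does not state the dtype of the arrays, so the following is a guess: a constant column gives a zero denominator (0/0, hence the divide warning), and small integer dtypes can wrap around when squared, which may make the term under the square root negative. The sketch below casts to float64, centres each column before squaring, and returns NaN for zero-variance columns without emitting warnings. The name corr_columns_safe and the tiny int8 arrays are hypothetical, chosen only for illustration.

```python
import numpy as np

def corr_columns_safe(A, B):
    """Column-wise Pearson correlation; NaN where a column has zero variance."""
    # Cast first so squaring cannot wrap around in a small integer dtype.
    A = np.asarray(A, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)

    A_c = A - A.mean(axis=0)
    B_c = B - B.mean(axis=0)
    ssA = (A_c ** 2).sum(axis=0)
    ssB = (B_c ** 2).sum(axis=0)
    denom = np.sqrt(np.outer(ssA, ssB))

    # Divide only where the denominator is positive; the rest stays NaN.
    out = np.full(denom.shape, np.nan)
    np.divide(A_c.T @ B_c, denom, out=out, where=denom > 0)
    return out

# Hypothetical int8 data: the second column of `traces` is constant.
traces = np.array([[100, 5], [120, 5], [90, 5], [110, 5]], dtype=np.int8)
weights = np.array([[1], [3], [0], [2]], dtype=np.int8)

# Squaring in int8 wraps around, so these are not the true squares.
squared_int8 = traces ** 2

result = corr_columns_safe(traces, weights)   # shape (2, 1)
```

In this toy input the first column of traces is an exact linear function of weights, so its correlation is 1, while the constant column has no defined correlation and comes back as NaN.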

Update

This is not a duplicate. I tried both of the methods you pointed to, in "Computing the correlation coefficient between two multi-dimensional arrays" [1] and "Efficient pairwise correlation for two matrices of features" [2], but neither of them worked.
