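Given a square pandas DataFrame of the following form:

         a    b    c
    a    1   .5   .3
    b   .5    1   .4
    c   .3   .4    1

how can I melt it so that I get only the upper triangle?

    Row  Column  Value
    a    a       1
    a    b       .5
    a    c       .3
    b    b       1
    b    c       .4
    c    c       1

Note that the combination a,b is listed only once. There is no b,a listing.

I'm more interested in an idiomatic pandas solution; a custom indexer would be easy enough to write by hand... Thanks in advance for your consideration and response.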
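First, I convert the values in the lower triangle of `df` to `NaN` using `where` and `numpy.triu`, then `stack`, `reset_index`, and set the column names:

    import numpy as np

    print(df)
    #      a    b    c
    # a  1.0  0.5  0.3
    # b  0.5  1.0  0.4
    # c  0.3  0.4  1.0

    # Boolean mask that is True on and above the diagonal
    mask = np.triu(np.ones(df.shape)).astype(bool)
    print(mask)
    # [[ True  True  True]
    #  [False  True  True]
    #  [False False  True]]

    # Keep the upper triangle; everything below the diagonal becomes NaN
    df = df.where(mask)
    print(df)
    #      a    b    c
    # a  1.0  0.5  0.3
    # b  NaN  1.0  0.4
    # c  NaN  NaN  1.0

    # Reshape to long form; dropna() is added here in case stack() keeps NaN cells
    df = df.stack().dropna().reset_index()
    df.columns = ['Row', 'Column', 'Value']
    print(df)
    #   Row Column  Value
    # 0   a      a    1.0
    # 1   a      b    0.5
    # 2   a      c    0.3
    # 3   b      b    1.0
    # 4   b      c    0.4
    # 5   c      c    1.0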
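The steps above can be folded into one small helper. This is only a minimal sketch: the function name `melt_upper_triangle` and its `names` parameter are made up for illustration, and the `.dropna()` call is an added safeguard so the result does not depend on whether `stack` drops missing values by default in the installed pandas version.

```python
import numpy as np
import pandas as pd


def melt_upper_triangle(df, names=("Row", "Column", "Value")):
    """Return the upper triangle (diagonal included) of a square frame in long form.

    Hypothetical helper, not part of pandas.
    """
    # Boolean mask: True on and above the diagonal.
    mask = np.triu(np.ones(df.shape, dtype=bool))
    # where() keeps the masked-in cells and turns the rest into NaN;
    # stack() reshapes to long form and dropna() removes the NaN cells.
    out = df.where(mask).stack().dropna().reset_index()
    out.columns = list(names)
    return out


df = pd.DataFrame(
    {"a": [1, .5, .3], "b": [.5, 1, .4], "c": [.3, .4, 1]},
    index=list("abc"),
)
result = melt_upper_triangle(df)
print(result)
```

One caveat of this sketch: a genuine `NaN` inside the upper triangle would be dropped along with the masked cells.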
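Building on @jezrael's solution, boolean indexing would be a more explicit approach:

    import numpy as np
    from pandas import DataFrame

    df = DataFrame({'a': [1, .5, .3], 'b': [.5, 1, .4], 'c': [.3, .4, 1]}, index=list('abc'))
    print(df, '\n')

    # Flattened upper-triangle mask, one entry per cell of df
    keep = np.triu(np.ones(df.shape)).astype('bool').reshape(df.size)
    print(df.stack()[keep])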
Output:

         a    b    c
    a  1.0  0.5  0.3
    b  0.5  1.0  0.4
    c  0.3  0.4  1.0

    a  a    1.0
       b    0.5
       c    0.3
    b  b    1.0
       c    0.4
    c  c    1.0
    dtype: float64
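The boolean-indexing result is a Series with a two-level index rather than the three-column table from the question. If that layout is needed, something like the following might work. It is a sketch that assumes `df` has no missing values (so `stack` returns exactly `df.size` entries in row-major order), and the `Row`/`Column`/`Value` names are taken from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1, .5, .3], "b": [.5, 1, .4], "c": [.3, .4, 1]},
    index=list("abc"),
)

# Flattened upper-triangle mask, in the same row-major order as df.stack().
keep = np.triu(np.ones(df.shape, dtype=bool)).ravel()

stacked = df.stack()[keep]
# Name the two index levels, then move them into ordinary columns.
table = stacked.rename_axis(["Row", "Column"]).reset_index(name="Value")
print(table)
```

Here `ravel()` plays the same role as `reshape(df.size)` in the answer above; both flatten the mask to one entry per cell.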