I have a `DataFrame` that occasionally has missing values. It looks like this:

           Monday  Tuesday  Wednesday
    ==================================
    Mike       42      NaN         12
    Jenna     NaN      NaN         15
    Jon        21        4          1

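I'd like to add a new `column` to my DataFrame that holds the average of all `columns` for each `row`.

That is, for `Mike` I need `(df['Monday'] + df['Wednesday'])/2`, but for `Jenna` I would simply use `df['Wednesday']/1`.

Does anyone know the best way to account for this variation caused by missing values and still calculate the average?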
You can simply do:

    df['avg'] = df.mean(axis=1)

           Monday  Tuesday  Wednesday        avg
    Mike       42      NaN         12  27.000000
    Jenna     NaN      NaN         15  15.000000
    Jon        21        4          1   8.666667

This works because `.mean()` ignores missing values by default; see the docs.
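
For a self-contained check, here is a minimal sketch that rebuilds the example frame and compares the default behaviour with `skipna=False`. The construction code is an assumption, since the question only shows the printed table.

```python
import numpy as np
import pandas as pd

# Rebuild the example table by hand (assumed construction, not from the question).
df = pd.DataFrame(
    {
        "Monday": [42, np.nan, 21],
        "Tuesday": [np.nan, np.nan, 4],
        "Wednesday": [12, 15, 1],
    },
    index=["Mike", "Jenna", "Jon"],
)

# Default (skipna=True): NaN cells are left out of each row's mean,
# so the divisor is the number of non-missing values in that row.
row_avg = df.mean(axis=1)

# With skipna=False, any row that contains a NaN yields NaN instead.
row_avg_strict = df.mean(axis=1, skipna=False)

print(row_avg)
print(row_avg_strict)
```

With the default setting, only `Jon` has all three values, so his is the only row whose result stays the same under `skipna=False`.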
To average over only a subset of the columns, you can do:

    df['avg'] = df[['Monday', 'Tuesday']].mean(axis=1)

           Monday  Tuesday  Wednesday   avg
    Mike       42      NaN         12  42.0
    Jenna     NaN      NaN         15   NaN
    Jon        21        4          1  12.5

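
A runnable sketch of the subset case, again assuming the frame is built by hand. The names `cols` and `n_valid` are hypothetical, chosen only for illustration. It also counts how many non-missing values went into each average, which explains why `Jenna` ends up with `NaN`: she has no valid values in the selected columns.

```python
import numpy as np
import pandas as pd

# Rebuild the example table by hand (assumed construction, not from the question).
df = pd.DataFrame(
    {
        "Monday": [42, np.nan, 21],
        "Tuesday": [np.nan, np.nan, 4],
        "Wednesday": [12, 15, 1],
    },
    index=["Mike", "Jenna", "Jon"],
)

cols = ["Monday", "Tuesday"]  # hypothetical name for the chosen subset

# Row-wise mean over the selected columns only; NaN cells are skipped.
df["avg"] = df[cols].mean(axis=1)

# Number of non-missing values per row in the selected columns.
n_valid = df[cols].count(axis=1)

print(df)
print(n_valid)
```

If a `NaN` result is not acceptable for rows with no valid values, a call such as `df["avg"].fillna(...)` can substitute a value afterwards. Which value is appropriate depends on the data.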