
Pandas: a column whose value depends on another column

Author: 可爱的天使keven_464 | 2023-09-07 16:59


I have a Pandas DataFrame like the one below:

   col1  col2  col3  col4
0     5     1    11     9
1     2     3    14     7
2     6     5    54     8
3    11     2    67    44
4    23     8     2    23
5    97     5     9     8
6     9     7    45    71

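I want to create a fifth column (col5) that, depending on the value of col1, takes its value from one of the other columns.

This is roughly what I would like it to look like, but I am running into problems.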

if col1 < 3:
   col5 = col2
elif 3 <= col1 < 7:
   col5 = col3
elif 7 <= col1 < 50:
   col5 = col4

That would produce the following DataFrame:

   col1  col2  col3  col4  col5
0     5     1    11     9    11
1     2     3    14     7     3
2     6     5    54     8    54
3    11     2    67    44    44
4    23     8     2    23    23
5    97     5     9     8     8
6     9     7    45    71    71

Thanks in advance, and let me know if you have any questions.

1> jezrael:

You can nest several numpy.where calls. For rows where no condition is True (col1 >= 50), a final fallback value of 1 is used:

import numpy as np

df['col5'] = np.where(df['col1'] < 3, df['col2'],
             np.where((df['col1'] >= 3) & (df['col1'] < 7), df['col3'],
             np.where((df['col1'] >= 7) & (df['col1'] < 50), df['col4'], 1)))
print(df)
   col1  col2  col3  col4  col5
0     5     1    11     9    11
1     2     3    14     7     3
2     6     5    54     8    54
3    11     2    67    44    44
4    23     8     2    23    23
5    97     5     9     8     1
6     9     7    45    71    71
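
As an alternative to nesting, the same rule can be written as a flat list of conditions with numpy.select. This is only a minimal sketch, assuming the sample data from the question with col1 = 97 in row 5 (as in the printed outputs); it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question (row 5 uses col1 = 97, as in the printed outputs).
df = pd.DataFrame({
    "col1": [5, 2, 6, 11, 23, 97, 9],
    "col2": [1, 3, 5, 2, 8, 5, 7],
    "col3": [11, 14, 54, 67, 2, 9, 45],
    "col4": [9, 7, 8, 44, 23, 8, 71],
})

# Conditions are checked in order; for each row the first True one wins.
conditions = [
    df["col1"] < 3,
    df["col1"] < 7,   # only reached when col1 >= 3
    df["col1"] < 50,  # only reached when col1 >= 7
]
choices = [df["col2"], df["col3"], df["col4"]]

# default=1 plays the role of the final fallback in the nested np.where.
df["col5"] = np.select(conditions, choices, default=1)
print(df)
```

Because the conditions are evaluated in order, the lower bounds (>= 3, >= 7) do not need to be repeated.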

Edit, after the values in the question were changed:

If you need col4 for all rows where col1 >= 7:

df['col5'] = np.where(df['col1'] < 3, df['col2'],
             np.where((df['col1'] >= 3) & (df['col1'] < 7), df['col3'], df['col4']))
print(df)
   col1  col2  col3  col4  col5
0     5     1    11     9    11
1     2     3    14     7     3
2     6     5    54     8    54
3    11     2    67    44    44
4    23     8     2    23    23
5    97     5     9     8     8
6     9     7    45    71    71
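
The parentheses around each comparison matter, because `&` binds more tightly than `<` and `>=` in Python. The sketch below, using only the col1 values from the sample data, illustrates what is likely to go wrong when a condition is written without them.

```python
import pandas as pd

col1 = pd.Series([5, 2, 6, 11, 23, 97, 9], name="col1")

# Without parentheses, `7 & col1` is evaluated first, and the resulting
# chained comparison needs the truth value of a whole Series, which pandas refuses.
try:
    col1 < 7 & col1 >= 3
    failed = False
except ValueError:
    failed = True

# With parentheses, each comparison yields a boolean Series and `&`
# combines them element-wise.
mask = (col1 >= 3) & (col1 < 7)
print(failed)
print(mask.tolist())
```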

Timings for len(df) = 7000:

In [441]: %timeit df['col51'] = np.where(df['col1'] <3, df['col2'], np.where((df['col1'] <7) & (df['col1'] >=3 ), df['col3'], df['col4']))
The slowest run took 5.31 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.25 ms per loop

In [442]: %timeit df["col52"] = df.apply(lambda x: col52(x), axis=1)
1 loop, best of 3: 552 ms per loop

In [443]: %timeit df["col53"] = [col53(c1,c2,c3,c4) for c1,c2,c3,c4 in zip(df.col1,df.col2,df.col3,df.col4)]
100 loops, best of 3: 9.87 ms per loop

Timings for len(df) = 70k:

In [446]: %timeit df['col51'] = np.where(df['col1'] <3, df['col2'], np.where((df['col1'] <7) & (df['col1'] >=3 ), df['col3'], df['col4']))
100 loops, best of 3: 2.5 ms per loop

In [447]: %timeit df["col52"] = df.apply(lambda x: col52(x), axis=1)
1 loop, best of 3: 5.36 s per loop

In [448]: %timeit df["col53"] = [col53(c1,c2,c3,c4) for c1,c2,c3,c4 in zip(df.col1,df.col2,df.col3,df.col4)]
10 loops, best of 3: 96.3 ms per loop

Code used for the timings:

import numpy as np
import pandas as pd

# df is the 7-row frame from the question; use 10000 instead of 1000 for 70k rows
df = pd.concat([df] * 1000).reset_index(drop=True)

def col52(x):
    # row-wise version, called through DataFrame.apply
    if x["col1"] < 3:
        return x["col2"]
    elif x["col1"] < 7:
        return x["col3"]
    else:
        # col1 >= 7, matching the final fallback of the np.where version
        return x["col4"]

def col53(c1, c2, c3, c4):
    # scalar version, called from a list comprehension over zipped columns
    if c1 < 3:
        return c2
    elif c1 < 7:
        return c3
    else:
        return c4

df['col51'] = np.where(df['col1'] < 3, df['col2'],
              np.where((df['col1'] >= 3) & (df['col1'] < 7), df['col3'], df['col4']))
df["col52"] = df.apply(lambda x: col52(x), axis=1)
df["col53"] = [col53(c1, c2, c3, c4) for c1, c2, c3, c4 in zip(df.col1, df.col2, df.col3, df.col4)]
print(df)
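
The %timeit magic is only available in IPython. Outside of it, a comparison along the same lines could be sketched with the standard-library timeit module. The helper names `vectorized` and `zipped` below are made up for this illustration, and the numbers it prints will differ from the ones quoted above.

```python
import timeit

import numpy as np
import pandas as pd

# Sample data from the question (row 5 uses col1 = 97), repeated to 7000 rows.
base = pd.DataFrame({
    "col1": [5, 2, 6, 11, 23, 97, 9],
    "col2": [1, 3, 5, 2, 8, 5, 7],
    "col3": [11, 14, 54, 67, 2, 9, 45],
    "col4": [9, 7, 8, 44, 23, 8, 71],
})
df = pd.concat([base] * 1000).reset_index(drop=True)

def vectorized(frame):
    # Nested np.where: col2 below 3, col3 below 7, col4 otherwise.
    return np.where(frame["col1"] < 3, frame["col2"],
           np.where(frame["col1"] < 7, frame["col3"], frame["col4"]))

def zipped(frame):
    # Same rule applied row by row in a list comprehension.
    return [c2 if c1 < 3 else c3 if c1 < 7 else c4
            for c1, c2, c3, c4 in zip(frame.col1, frame.col2, frame.col3, frame.col4)]

# Both approaches should agree before their speed is compared.
assert list(vectorized(df)) == zipped(df)

for name, func in [("np.where", vectorized), ("zip", zipped)]:
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(lambda: func(df), number=10, repeat=3)) / 10
    print(f"{name}: {best * 1e3:.2f} ms per call")
```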
