I have the following DataFrame:
    In [62]: df
    Out[62]:
                coverage   name  reports  year
    Cochice           45  Jason        4  2012
    Pima             214  Molly       24  2012
    Santa Cruz       212   Tina       31  2013
    Maricopa          72   Jake        2  2014
    Yuma              85    Amy        3  2014
Basically, I can filter rows like this:

    df[df["coverage"] > 30]
and I can drop rows by label like this:

    df.drop(['Cochice', 'Pima'])

But I want to drop rows based on a condition. How can I do that?
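The best approach is boolean indexing, but the condition has to be inverted: instead of removing the rows below 72, keep all rows whose value is equal to or greater than 72: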
    print(df[df["coverage"] >= 72])
                coverage   name  reports  year
    Pima             214  Molly       24  2012
    Santa Cruz       212   Tina       31  2013
    Maricopa          72   Jake        2  2014
    Yuma              85    Amy        3  2014
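For anyone who wants to run this, here is a minimal sketch that rebuilds the frame from the printed output in the question. The construction code and the names `to_drop`, `kept`, and `dropped` are my own, not from the question. It also shows that passing the matching index labels to `df.drop` gives the same rows, assuming the index labels are unique.

```python
import pandas as pd

# Rebuild the DataFrame from the question (values copied from its printed output).
df = pd.DataFrame(
    {
        "coverage": [45, 214, 212, 72, 85],
        "name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
        "reports": [4, 24, 31, 2, 3],
        "year": [2012, 2012, 2013, 2014, 2014],
    },
    index=["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"],
)

# Hypothetical name: the rows we want to get rid of.
to_drop = df["coverage"] < 72

# "Drop rows where coverage < 72" expressed as "keep rows where coverage >= 72".
kept = df[df["coverage"] >= 72]

# Alternative: hand the labels of the unwanted rows to df.drop.
# This assumes unique index labels; with duplicates, every row sharing
# a dropped label would be removed.
dropped = df.drop(df[to_drop].index)

print(kept.index.tolist())   # ['Pima', 'Santa Cruz', 'Maricopa', 'Yuma']
print(kept.equals(dropped))  # True
print(len(df))               # 5, the original frame is not modified
```

Neither form changes `df` in place; assign the result back (`df = df[df["coverage"] >= 72]`) if you want to keep it.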
This is the same as using the ge function:
    print(df[df["coverage"].ge(72)])
                coverage   name  reports  year
    Pima             214  Molly       24  2012
    Santa Cruz       212   Tina       31  2013
    Maricopa          72   Jake        2  2014
    Yuma              85    Amy        3  2014
Another possible solution is to invert the mask with ~:
    print(df["coverage"] < 72)
    Cochice        True
    Pima          False
    Santa Cruz    False
    Maricopa      False
    Yuma          False
    Name: coverage, dtype: bool

    print(~(df["coverage"] < 72))
    Cochice       False
    Pima           True
    Santa Cruz     True
    Maricopa       True
    Yuma           True
    Name: coverage, dtype: bool

    print(df[~(df["coverage"] < 72)])
                coverage   name  reports  year
    Pima             214  Molly       24  2012
    Santa Cruz       212   Tina       31  2013
    Maricopa          72   Jake        2  2014
    Yuma              85    Amy        3  2014
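One caveat worth checking on your own data: `>= 72` and `~(... < 72)` agree on the frame above, but they differ when the column contains missing values, because any comparison with NaN evaluates to False. A small sketch with a made-up Series (the values and the name `s` are illustrative only, not from the question):

```python
import numpy as np
import pandas as pd

# Illustrative data only: one value below 72, one missing, one above.
s = pd.Series([45.0, np.nan, 85.0])

# NaN >= 72 is False, so the missing row is filtered out.
print((s >= 72).tolist())    # [False, False, True]

# NaN < 72 is also False, so inverting it keeps the missing row.
print((~(s < 72)).tolist())  # [False, True, True]
```

So if the column may contain NaN, pick the form that matches what should happen to those rows, or handle them explicitly first.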