Use pandas DataFrames for groupby and aggregation functions.
Put your data in a list of lists; each inner list becomes one row of the DataFrame.
In[1]: import pandas as pd

       # Each inner list is one row: team, player, trip id, time
       mydata = [['Team1', 'Player1', 'idTrip13', 133],
                 ['Team2', 'Player333', 'idTrip10', 18373],
                 ['Team3', 'Player22', 'idTrip12', 17338899],
                 ['Team2', 'Player293', 'idTrip02', 17656],
                 ['Team3', 'Player20', 'idTrip11', 1883],
                 ['Team1', 'Player1', 'idTrip19', 19393]]
       df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])
       df
Out[1]:
    team     player     trips      time
0  Team1    Player1  idTrip13       133
1  Team2  Player333  idTrip10     18373
2  Team3   Player22  idTrip12  17338899
3  Team2  Player293  idTrip02     17656
4  Team3   Player20  idTrip11      1883
5  Team1    Player1  idTrip19     19393
Call groupby(), passing the column or columns you want to use as the grouper, then apply a function to the resulting groups.
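To see what the grouper produces before any function is applied, one can iterate over the object that groupby() returns. The following is a minimal sketch using the same data as above; the group_sizes dictionary is a name introduced here purely for illustration.

```python
import pandas as pd

mydata = [['Team1', 'Player1', 'idTrip13', 133],
          ['Team2', 'Player333', 'idTrip10', 18373],
          ['Team3', 'Player22', 'idTrip12', 17338899],
          ['Team2', 'Player293', 'idTrip02', 17656],
          ['Team3', 'Player20', 'idTrip11', 1883],
          ['Team1', 'Player1', 'idTrip19', 19393]]
df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])

# Iterating over a GroupBy object yields (group key, sub-DataFrame) pairs.
group_sizes = {}  # illustrative helper, not part of the original answer
for team, rows in df.groupby('team'):
    group_sizes[team] = len(rows)
    print(team, rows['trips'].tolist())
```

Each sub-DataFrame holds only the rows whose team value equals the group key; the function applied after groupby() runs once per such group.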
Examples
Ex. 1: Find the number of trips each team made. team is the grouper, and we apply the count() function to the ['trips'] column.
In[2]: trip_count = df.groupby(by=['team'])['trips'].count()
       trip_count
Out[2]:
team
Team1    2
Team2    2
Team3    2
Name: trips, dtype: int64
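One detail worth knowing about count(): it counts only non-missing values in the selected column, whereas size() counts every row in the group. The sketch below assumes a small made-up frame with one missing trip id (the values are hypothetical, not taken from the data above).

```python
import pandas as pd

# Hypothetical data: Team1 has one row whose trip id is missing.
demo = pd.DataFrame({
    'team': ['Team1', 'Team1', 'Team2'],
    'trips': ['idTrip13', None, 'idTrip10'],
})

counted = demo.groupby('team')['trips'].count()  # skips the missing value
sized = demo.groupby('team').size()              # counts all rows

print(counted)
print(sized)
```

With complete data, as in the example above, both calls give the same numbers.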
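Ex. 2 (multiple columns): Find the total time spent by each player in a team. We use two columns, ['team', 'player'], as the grouper and apply the sum() function to the ['time'] column.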
In[3]: trip_time = df.groupby(by=['team', 'player'])['time'].sum()
       trip_time
Out[3]:
team   player
Team1  Player1         19526
Team2  Player293       17656
       Player333       18373
Team3  Player20         1883
       Player22     17338899
Name: time, dtype: int64
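Grouping by two columns returns a Series with a two-level index. If a flat table is easier to work with, one option is to pass as_index=False so the grouping columns stay as ordinary columns. This is a sketch of that variant, not part of the original answer; the name flat_time is introduced here for illustration.

```python
import pandas as pd

mydata = [['Team1', 'Player1', 'idTrip13', 133],
          ['Team2', 'Player333', 'idTrip10', 18373],
          ['Team3', 'Player22', 'idTrip12', 17338899],
          ['Team2', 'Player293', 'idTrip02', 17656],
          ['Team3', 'Player20', 'idTrip11', 1883],
          ['Team1', 'Player1', 'idTrip19', 19393]]
df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])

# as_index=False keeps 'team' and 'player' as regular columns
# instead of moving them into a MultiIndex.
flat_time = df.groupby(['team', 'player'], as_index=False)['time'].sum()
print(flat_time)
```

Calling reset_index() on the Series from Ex. 2 should give an equivalent result.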
Ex. 3 (multiple functions): For each player in a team, find the total number of trips and the total trip time.
In[4]: player_total = df.groupby(by=['team', 'player']).agg({'time': 'sum', 'trips': 'count'})
       player_total
Out[4]:
                 trips      time
team  player
Team1 Player1        2     19526
Team2 Player293      1     17656
      Player333      1     18373
Team3 Player20       1      1883
      Player22       1  17338899
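When passing a dictionary to agg(), the order of the output columns can depend on the pandas version. An alternative that makes both the column names and their order explicit is named aggregation, available in pandas 0.25 and later. The sketch below is an assumption about how one might rewrite Ex. 3 that way, not the original author's code.

```python
import pandas as pd

mydata = [['Team1', 'Player1', 'idTrip13', 133],
          ['Team2', 'Player333', 'idTrip10', 18373],
          ['Team3', 'Player22', 'idTrip12', 17338899],
          ['Team2', 'Player293', 'idTrip02', 17656],
          ['Team3', 'Player20', 'idTrip11', 1883],
          ['Team1', 'Player1', 'idTrip19', 19393]]
df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])

# Named aggregation: output_column=(input_column, function).
# The output columns appear in the order the keywords are written.
player_total = df.groupby(['team', 'player']).agg(
    trips=('trips', 'count'),
    time=('time', 'sum'),
)
print(player_total)
```

The keyword names on the left are free to choose, so the result columns can be renamed in the same step, for example to trip_count and total_time.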