My simplified data looks like this:
    set.seed(1453)
    x = sample(0:1, 10, TRUE)
    date = c('2016-01-01', '2016-01-05', '2016-01-07', '2016-01-12', '2016-01-16',
             '2016-01-20', '2016-01-20', '2016-01-25', '2016-01-26', '2016-01-31')
    df = data.frame(x, date = as.Date(date))
    df

    x       date
    1 2016-01-01
    0 2016-01-05
    1 2016-01-07
    0 2016-01-12
    0 2016-01-16
    1 2016-01-20
    1 2016-01-20
    0 2016-01-25
    0 2016-01-26
    1 2016-01-31
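I want to count how many times x == 1 occurs within a given period, for example within 14 and 30 days after the current date (not counting the current entry itself, if it has x == 1). The desired output looks like this: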
    solution

    x       date x_plus14 x_plus30
    1 2016-01-01        1        3
    0 2016-01-05        1        4
    1 2016-01-07        2        3
    0 2016-01-12        2        3
    0 2016-01-16        2        3
    1 2016-01-20        2        2
    1 2016-01-20        1        1
    0 2016-01-25        1        1
    0 2016-01-26        1        1
    1 2016-01-31        0        0
Ideally I would like to do this with dplyr, but that is not a must. Any ideas on how to achieve this? Thanks a lot for your help!
Adding another approach, based on findInterval:
    cs = cumsum(df$x) # cumulative number of occurrences
    data.frame(df,
               plus14 = cs[findInterval(df$date + 14, df$date, left.open = TRUE)] - cs,
               plus30 = cs[findInterval(df$date + 30, df$date, left.open = TRUE)] - cs)
    #   x       date plus14 plus30
    #1  1 2016-01-01      1      3
    #2  0 2016-01-05      1      4
    #3  1 2016-01-07      2      3
    #4  0 2016-01-12      2      3
    #5  0 2016-01-16      2      3
    #6  1 2016-01-20      2      2
    #7  1 2016-01-20      1      1
    #8  0 2016-01-25      1      1
    #9  0 2016-01-26      1      1
    #10 1 2016-01-31      0      0
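To make the window semantics explicit, here is a rough brute-force cross-check. It is only a sketch, not part of the approach above: the helper name count_after is made up for illustration, and x is hard-coded to the values printed in the question so the result does not depend on the random number generator. It assumes the rows are sorted by date, and it reads "within N days" as strictly fewer than N days after the current row, counting only rows that come later in the table (which is what the desired output implies for the duplicated 2016-01-20 entries).

```r
# x is hard-coded to the values shown in the question
x <- c(1, 0, 1, 0, 0, 1, 1, 0, 0, 1)
date <- as.Date(c('2016-01-01', '2016-01-05', '2016-01-07', '2016-01-12',
                  '2016-01-16', '2016-01-20', '2016-01-20', '2016-01-25',
                  '2016-01-26', '2016-01-31'))
df <- data.frame(x, date)

# Hypothetical helper: for each row i, sum x over the rows that come after i
# and lie strictly fewer than `days` days past date[i].
# Assumes the input is already sorted by date.
count_after <- function(x, date, days) {
  idx <- seq_along(x)
  sapply(idx, function(i) {
    later  <- idx > i                                  # excludes the current row
    within <- as.numeric(date - date[i]) < days        # Date differences are in days
    sum(x[later & within])
  })
}

res <- data.frame(df,
                  x_plus14 = count_after(df$x, df$date, 14),
                  x_plus30 = count_after(df$x, df$date, 30))
res
```

This is quadratic in the number of rows, so it is mainly useful for checking results on small data; the cumsum/findInterval version should scale much better. Since the helper takes and returns plain vectors, it could presumably also be called inside dplyr's mutate().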