
Why does dplyr remove values that don't meet the condition?


I'm using dplyr to replace value with NA when a condition is met, but it also puts NA in places where it shouldn't.

dput:

df <- structure(list(id = c("USC00231275", "USC00231275", "USC00231275", 
"USC00231275", "USC00231275", "USC00231275", "USC00231275", "USC00231275", 
"USC00231275", "USC00231275"), element = c("TMAX", "TMIN", "TMAX", 
"TMIN", "TMAX", "TMIN", "TMAX", "TMIN", "TMAX", "TMIN"), year = c(1937, 
1937, 1937, 1937, 1937, 1937, 1937, 1937, 1937, 1937), month = c(5, 
5, 5, 5, 5, 5, 5, 5, 5, 5), day = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 
5), date = structure(c(-11933, -11933, -11932, -11932, -11931, 
-11931, -11930, -11930, -11929, -11929), class = "Date"), value = c(0, 
53.96, 68, 44.96, 62.06, 53.96, 73.04, 53.96, 69.08, 50)), .Names = c("id", 
"element", "year", "month", "day", "date", "value"), row.names = c(NA, 
10L), class = "data.frame")

data.frame (note: the condition is met only for rows 1 and 2)

            id element year month day       date value
1  USC00231275    TMAX 1937     5   1 1937-05-01  0.00
2  USC00231275    TMIN 1937     5   1 1937-05-01 53.96
3  USC00231275    TMAX 1937     5   2 1937-05-02 68.00
4  USC00231275    TMIN 1937     5   2 1937-05-02 44.96
5  USC00231275    TMAX 1937     5   3 1937-05-03 62.06
6  USC00231275    TMIN 1937     5   3 1937-05-03 53.96
7  USC00231275    TMAX 1937     5   4 1937-05-04 73.04
8  USC00231275    TMIN 1937     5   4 1937-05-04 53.96
9  USC00231275    TMAX 1937     5   5 1937-05-05 69.08
10 USC00231275    TMIN 1937     5   5 1937-05-05 50.00
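
The note above can be checked without dplyr. This is a minimal base R sketch, assuming the condition is "TMIN >= TMAX on the same date"; the `chk` data frame is a hypothetical trimmed copy of the dput data (only `date`, `element`, `value`), and `cond_by_date` is a name made up for this example.

```r
# Trimmed copy of the data above: only the columns needed for the check.
chk <- data.frame(
  date    = rep(as.Date("1937-05-01") + 0:4, each = 2),
  element = rep(c("TMAX", "TMIN"), times = 5),
  value   = c(0, 53.96, 68, 44.96, 62.06, 53.96, 73.04, 53.96, 69.08, 50)
)

# For each date, test whether that day's TMIN is >= that day's TMAX.
cond_by_date <- sapply(split(chk, chk$date), function(g) {
  g$value[g$element == "TMIN"] >= g$value[g$element == "TMAX"]
})

cond_by_date
```

Only 1937-05-01 (rows 1 and 2) comes back TRUE; the other four dates, including 1937-05-03, are FALSE.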

dplyr

library(dplyr)

df %>%
  group_by(date) %>%
  mutate(
    # set the whole group to NA when that day's TMIN is >= its TMAX
    value = if(value[element == 'TMIN'] >= value[element == 'TMAX'])
      as.numeric(NA) else value
  )

            id element  year month   day       date value
         (chr)   (chr) (dbl) (dbl) (dbl)     (date) (dbl)
1  USC00231275    TMAX  1937     5     1 1937-05-01    NA
2  USC00231275    TMIN  1937     5     1 1937-05-01    NA
3  USC00231275    TMAX  1937     5     2 1937-05-02 68.00
4  USC00231275    TMIN  1937     5     2 1937-05-02 44.96
5  USC00231275    TMAX  1937     5     3 1937-05-03    NA
6  USC00231275    TMIN  1937     5     3 1937-05-03    NA
7  USC00231275    TMAX  1937     5     4 1937-05-04 73.04
8  USC00231275    TMIN  1937     5     4 1937-05-04 53.96
9  USC00231275    TMAX  1937     5     5 1937-05-05 69.08
10 USC00231275    TMIN  1937     5     5 1937-05-05 50.00

Note that the only rows that should change are 1 and 2, but dplyr also changed rows 5 and 6 even though the condition is not met there.
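
One thing that may be worth trying is to avoid the scalar `if ... else` inside `mutate()` and use a vectorised replacement instead. The sketch below is not a confirmed explanation of why rows 5 and 6 changed; it only expresses the same intent in a different form. The names `wx`, `bad`, and `fixed` are hypothetical, and `wx` is a trimmed copy of the data above.

```r
library(dplyr)

# Trimmed copy of the data above: only the columns needed here.
wx <- data.frame(
  date    = rep(as.Date("1937-05-01") + 0:4, each = 2),
  element = rep(c("TMAX", "TMIN"), times = 5),
  value   = c(0, 53.96, 68, 44.96, 62.06, 53.96, 73.04, 53.96, 69.08, 50)
)

fixed <- wx %>%
  group_by(date) %>%
  mutate(
    # One logical per date, recycled to every row of that date.
    bad   = value[element == "TMIN"] >= value[element == "TMAX"],
    # Vectorised replacement: rows flagged as bad become NA, others are kept.
    value = replace(value, bad, NA_real_)
  ) %>%
  ungroup() %>%
  select(-bad)

fixed
```

With a current dplyr release, only rows 1 and 2 should end up as NA here. This assumes each date has exactly one TMIN row and one TMAX row; a date with a missing or duplicated element would need separate handling.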
