Finding the nearest value with data.table and roll = 'nearest'

I want to use the data.table package with the argument roll = 'nearest' to find the nearest match on a 'date' column. I match on another column (letters) first:

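library(data.table)

set.seed(1)
# The 4 letters and 4 values are recycled explicitly with rep_len(); recent
# data.table versions raise an error on implicit recycling to 11 (or 5) rows.
A <- data.table( dates.A = seq.Date(as.Date('2008-01-01'),as.Date('2008-01-31'), by = '3 days'), 
                 letters.A = rep_len(LETTERS[1:4], 11) , value.A = rep_len(runif(4), 11) )

B <- data.table( date.B = seq.Date(as.Date('2008-01-01'),as.Date('2008-01-05'), by = 'days'), 
                 letters.B = rep_len(LETTERS[1:4], 5) , value.B = rep_len(runif(4), 5) )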

#### Define the columns I merge on

A[, ':=' (dates.merge = dates.A, letters.merge = letters.A)]
B[, ':=' (dates.merge = date.B, letters.merge = letters.B)]
setkeyv(A, c('letters.merge','dates.merge'))
setkeyv(B, c('letters.merge','dates.merge'))

result <- B[A, roll = 'nearest']

#### As a side note, how do I avoid the change in order of my data.tables??
setorder(result,dates.A,letters.A)
setorder(A,dates.A)
setorder(B,date.B)

result, A and B print as follows:

> result
        date.B letters.B   value.B dates.merge letters.merge    dates.A letters.A   value.A
 1: 2008-01-01         A 0.2016819  2008-01-01             A 2008-01-01         A 0.2655087
 2: 2008-01-02         B 0.8983897  2008-01-04             B 2008-01-04         B 0.3721239
 3: 2008-01-03         C 0.9446753  2008-01-07             C 2008-01-07         C 0.5728534
 4: 2008-01-04         D 0.6607978  2008-01-10             D 2008-01-10         D 0.9082078
 5: 2008-01-05         A 0.2016819  2008-01-13             A 2008-01-13         A 0.2655087
 6: 2008-01-02         B 0.8983897  2008-01-16             B 2008-01-16         B 0.3721239
 7: 2008-01-03         C 0.9446753  2008-01-19             C 2008-01-19         C 0.5728534
 8: 2008-01-04         D 0.6607978  2008-01-22             D 2008-01-22         D 0.9082078
 9: 2008-01-05         A 0.2016819  2008-01-25             A 2008-01-25         A 0.2655087
10: 2008-01-02         B 0.8983897  2008-01-28             B 2008-01-28         B 0.3721239
11: 2008-01-03         C 0.9446753  2008-01-31             C 2008-01-31         C 0.5728534
> A
       dates.A letters.A   value.A dates.merge letters.merge
 1: 2008-01-01         A 0.2655087  2008-01-01             A
 2: 2008-01-04         B 0.3721239  2008-01-04             B
 3: 2008-01-07         C 0.5728534  2008-01-07             C
 4: 2008-01-10         D 0.9082078  2008-01-10             D
 5: 2008-01-13         A 0.2655087  2008-01-13             A
 6: 2008-01-16         B 0.3721239  2008-01-16             B
 7: 2008-01-19         C 0.5728534  2008-01-19             C
 8: 2008-01-22         D 0.9082078  2008-01-22             D
 9: 2008-01-25         A 0.2655087  2008-01-25             A
10: 2008-01-28         B 0.3721239  2008-01-28             B
11: 2008-01-31         C 0.5728534  2008-01-31             C
> B
       date.B letters.B   value.B dates.merge letters.merge
1: 2008-01-01         A 0.2016819  2008-01-01             A
2: 2008-01-02         B 0.8983897  2008-01-02             B
3: 2008-01-03         C 0.9446753  2008-01-03             C
4: 2008-01-04         D 0.6607978  2008-01-04             D
5: 2008-01-05         A 0.2016819  2008-01-05             A

However, note that the closest date to dates.A "2008-01-07" should be "2008-01-05" (see B), not "2008-01-03". The same applies to every dates.A value listed below "2008-01-07" in result.

What am I doing wrong here?
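
One possible reading of the output above: in a data.table join, roll applies only to the last join column, and the earlier join columns must match exactly. With letters.merge as the first key column, "nearest" then means the nearest date among the B rows that carry the same letter. The sketch below is an illustration under that reading, not a confirmed answer. It rebuilds the two tables without the random value columns and joins with on = instead of setkeyv(). The names by_letter and by_date are made up for this example, and joining on the date alone assumes the letter match is not actually required.

```r
library(data.table)

# Same dates and letters as in the question; the value columns are left out.
dates.A <- seq.Date(as.Date("2008-01-01"), as.Date("2008-01-31"), by = "3 days")
date.B  <- seq.Date(as.Date("2008-01-01"), as.Date("2008-01-05"), by = "days")

A <- data.table(dates.A = dates.A,
                letters.A = rep_len(LETTERS[1:4], length(dates.A)))
B <- data.table(date.B = date.B,
                letters.B = rep_len(LETTERS[1:4], length(date.B)))

A[, `:=`(dates.merge = dates.A, letters.merge = letters.A)]
B[, `:=`(dates.merge = date.B, letters.merge = letters.B)]

# Join on letter AND date: the letter must match exactly, and the roll is
# applied to the last join column (the date) within that letter only.
by_letter <- B[A, on = c("letters.merge", "dates.merge"), roll = "nearest"]

# Join on the date alone (assumption: the letter match is not needed).
by_date <- B[A, on = "dates.merge", roll = "nearest"]

# B holds letter C only on 2008-01-03, so the first join returns that date.
print(by_letter[dates.A == as.Date("2008-01-07"), date.B])

# Ignoring the letter, the nearest B date to 2008-01-07 is 2008-01-05.
print(by_date[dates.A == as.Date("2008-01-07"), date.B])
```

On the side note about row order: on = does not set a key, so A and B keep their original order and the joined rows come back in the order of A. setkeyv() sorts each table by reference, which is what reorders them in the code above.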