A simple way to convert long format to wide format by counting

I have the following dataset:

sample.data <- data.frame(Step = c(1,2,3,4,1,2,1,2,3,1,1),
                          Case = c(1,1,1,1,2,2,3,3,3,4,5),
                          Decision = c("Referred","Referred","Referred","Approved","Referred","Declined","Referred","Referred","Declined","Approved","Declined"))

sample.data

   Step Case Decision
1     1    1 Referred
2     2    1 Referred
3     3    1 Referred
4     4    1 Approved
5     1    2 Referred
6     2    2 Declined
7     1    3 Referred
8     2    3 Referred
9     3    3 Declined
10    1    4 Approved
11    1    5 Declined

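Is it possible in R to convert this to a wide table format, with the decisions as column headers and each cell holding the count of occurrences, for example: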

Case    Referred    Approved    Declined
1          3           1            0
2          1           0            1
3          2           0            1
4          0           1            0
5          0           0            1

Jaap.. 13

The aggregation argument of the dcast function in the reshape2 package defaults to length (= count). The data.table package implements an improved version of dcast. So in your case this would be:

library('reshape2') # or library('data.table')
newdf <- dcast(sample.data, Case ~ Decision)

Or with the arguments stated explicitly:

newdf <- dcast(sample.data, Case ~ Decision,
               value.var = "Decision", fun.aggregate = length)

This gives the following data frame:

> newdf
  Case Approved Declined Referred
1    1        1        0        3
2    2        0        1        1
3    3        0        1        2
4    4        1        0        0
5    5        0        1        0

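If you do not specify an aggregation function, you get a notice (a message or a warning, depending on the package and version) telling you that dcast is using length as the default.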
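
The dcast result above lists the decision columns alphabetically, while the question asks for Referred, Approved, Declined. A minimal sketch of one way to get that order, assuming reshape2 is installed; the helper vector `wanted` is a name introduced here for illustration and is not part of the original answer:

```r
library(reshape2)  # assumption: reshape2 is installed

sample.data <- data.frame(
  Step = c(1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 1),
  Case = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 5),
  Decision = c("Referred", "Referred", "Referred", "Approved", "Referred",
               "Declined", "Referred", "Referred", "Declined", "Approved",
               "Declined")
)

# Count rows per Case/Decision pair; naming fun.aggregate explicitly
# avoids the "defaulting to length" notice.
newdf <- dcast(sample.data, Case ~ Decision,
               value.var = "Decision", fun.aggregate = length)

# Reorder the columns to match the layout requested in the question.
# `wanted` is a hypothetical helper name used only in this sketch.
wanted <- c("Case", "Referred", "Approved", "Declined")
newdf <- newdf[, wanted]
```

Plain column indexing is used here because it does not depend on how dcast orders its output columns.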



2> TARehman..:

You can accomplish this with a simple table() call. Setting the factor levels gives you the columns in the order you want.

sample.data$Decision <- factor(x = sample.data$Decision,
                               levels = c("Referred","Approved","Declined"))

table(Case = sample.data$Case,sample.data$Decision)

Case Referred Approved Declined
   1        3        1        0
   2        1        0        1
   3        2        0        1
   4        0        1        0
   5        0        0        1
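
table() returns a contingency table rather than a data frame. If a data frame with an explicit Case column is needed, as in the question's target layout, one possible base R sketch follows. The names `counts` and `wide` are introduced here for illustration, and the conversion step is an addition to the original answer:

```r
sample.data <- data.frame(
  Step = c(1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 1),
  Case = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 5),
  Decision = c("Referred", "Referred", "Referred", "Approved", "Referred",
               "Declined", "Referred", "Referred", "Declined", "Approved",
               "Declined")
)

# Fix the level order so the columns come out as Referred, Approved, Declined.
sample.data$Decision <- factor(sample.data$Decision,
                               levels = c("Referred", "Approved", "Declined"))

# Cross-tabulate Case against Decision.
counts <- table(Case = sample.data$Case, Decision = sample.data$Decision)

# Turn the 2-D table into a data frame, one column per decision.
wide <- as.data.frame.matrix(counts)

# Move the case identifiers from the row names into a regular column.
wide <- cbind(Case = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL
```

as.data.frame.matrix() is used instead of as.data.frame() because the latter would turn the table back into long format with a Freq column.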



3> Tyler Rinker..:

Here is a dplyr + tidyr approach:

if (!require("pacman")) install.packages("pacman")
pacman::p_load(dplyr, tidyr)

sample.data %>%
    count(Case, Decision) %>%
    spread(Decision, n, fill = 0)

##    Case Approved Declined Referred
##   (dbl)    (dbl)    (dbl)    (dbl)
## 1     1        1        0        3
## 2     2        0        1        1
## 3     3        0        1        2
## 4     4        1        0        0
## 5     5        0        1        0
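
In current tidyr releases spread() is superseded by pivot_wider(). A sketch of the same pipeline with the newer function, assuming dplyr and a tidyr version that provides pivot_wider() (1.0.0 or later) are installed; the name `wide` is introduced here for illustration:

```r
library(dplyr)  # assumption: dplyr is installed
library(tidyr)  # assumption: tidyr >= 1.0.0 for pivot_wider()

sample.data <- data.frame(
  Step = c(1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 1),
  Case = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 5),
  Decision = c("Referred", "Referred", "Referred", "Approved", "Referred",
               "Declined", "Referred", "Referred", "Declined", "Approved",
               "Declined")
)

wide <- sample.data %>%
  count(Case, Decision) %>%              # one row per Case/Decision pair, count in n
  pivot_wider(names_from = Decision,     # decisions become column names
              values_from = n,           # counts become cell values
              values_fill = 0)           # missing combinations are filled with 0
```

The column order may differ from the spread() output above, so select the columns by name rather than by position.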
