I'm looking for the best "R way" to flatten a data frame that looks like this:
    CAT  COUNT  TREAT
    A    1,2,3  Treat-a, Treat-b
    B    4,5    Treat-c,Treat-d,Treat-e
so that it is structured like this:
    CAT  COUNT1  COUNT2  COUNT3  TREAT1   TREAT2   TREAT3
    A    1       2       3       Treat-a  Treat-b  NA
    B    4       5       NA      Treat-c  Treat-d  Treat-e
Sample code to generate the source data frame:
    df <- data.frame(CAT = c("A", "B"))
    df$COUNT <- list(1:3, 4:5)
    df$TREAT <- list(paste("Treat-", letters[1:2], sep = ""),
                     paste("Treat-", letters[3:5], sep = ""))
I believe I need some combination of rbind and unlist? Any help would be greatly appreciated. - Tim
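Here is a solution using base R. It accepts vectors of any length inside the list columns, and you do not have to specify which columns of the data frame to collapse. Part of the solution was produced using this answer.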
    df2 <- do.call(cbind, lapply(df, function(x) {
      # check if it is a list, otherwise just return as is
      if (is.list(x)) {
        return(data.frame(t(sapply(x, '[', seq(max(sapply(x, length)))))))
      } else {
        return(x)
      }
    }))
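The padding in the code above relies on the fact that indexing a vector past its end returns NA. A minimal sketch of that step in isolation, using an illustrative list named `x` that is not part of the original answer:

```r
# Illustrative list with vectors of unequal length (hypothetical name `x`)
x <- list(1:3, 4:5)

# The longest element decides how many columns are needed
n <- max(sapply(x, length))

# Indexing past the end of a vector yields NA, so each element is padded to length n
padded <- sapply(x, `[`, seq(n))

# sapply() returns one column per list element; transpose to get one row per element
m <- t(padded)
m
```

Each row of `m` corresponds to one row of the original data frame, which is why the result can be wrapped in data.frame() and bound column-wise.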
As of R 3.2 there is also lengths(), which can replace sapply(x, length):
    df3 <- do.call(cbind.data.frame, lapply(df, function(x) {
      # check if it is a list, otherwise just return as is
      if (is.list(x)) {
        data.frame(t(sapply(x, '[', seq(max(lengths(x))))))
      } else {
        x
      }
    }))
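As a quick sanity check of the substitution, the two calls can be compared directly on a small list. This is only a sketch; the list `x` is an illustrative stand-in for a list column:

```r
# Illustrative list, including an empty element (hypothetical name `x`)
x <- list(1:3, 4:5, character(0))

a <- lengths(x)         # available since R 3.2.0
b <- sapply(x, length)  # older equivalent for a non-empty list

same <- identical(a, b)
same
```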
Data used:
    df <- structure(list(
      CAT = structure(1:2, .Label = c("A", "B"), class = "factor"),
      COUNT = list(1:3, 4:5),
      TREAT = list(c("Treat-a", "Treat-b"),
                   c("Treat-c", "Treat-d", "Treat-e"))),
      .Names = c("CAT", "COUNT", "TREAT"),
      row.names = c(NA, -2L), class = "data.frame")
Here is another way in base R
    df <- data.frame(CAT = c("A", "B"))
    df$COUNT <- list(1:3, 4:5)
    df$TREAT <- list(paste("Treat-", letters[1:2], sep = ""),
                     paste("Treat-", letters[3:5], sep = ""))
Create a helper function to do the work
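    f <- function(l) {
      # non-list columns are returned unchanged
      if (!is.list(l)) return(l)
      # pad every element to the longest length, then stack the elements as rows
      do.call('rbind', lapply(l, function(x) `length<-`(x, max(lengths(l)))))
    }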
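The helper pads each element by calling the replacement function `length<-` in its functional form. A small sketch of what that call does on its own; the vector `v` is an illustrative name:

```r
# Illustrative short vector (hypothetical name `v`)
v <- c("Treat-a", "Treat-b")

# Called as a function, `length<-` returns a copy extended with NA; `v` itself is unchanged
padded <- `length<-`(v, 3)
padded

# A smaller length truncates instead of padding
short <- `length<-`(1:5, 2)
short
```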
Always test your code
    f(df$TREAT)
    #      [,1]      [,2]      [,3]
    # [1,] "Treat-a" "Treat-b" NA
    # [2,] "Treat-c" "Treat-d" "Treat-e"
Apply it
    df[] <- lapply(df, f)
    df
    #   CAT COUNT.1 COUNT.2 COUNT.3 TREAT.1 TREAT.2 TREAT.3
    # 1   A       1       2       3 Treat-a Treat-b    <NA>
    # 2   B       4       5      NA Treat-c Treat-d Treat-e
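The COUNT.1-style names in that printout suggest that COUNT and TREAT end up as matrix columns rather than seven separate columns. If one ordinary column per value is wanted, one option (a sketch, not part of the original answer) is to pass the padded columns to data.frame(), which splits matrix arguments into individual columns. The name `flat` is illustrative:

```r
# Rebuild the example data and the helper so the sketch is self-contained
df <- data.frame(CAT = c("A", "B"))
df$COUNT <- list(1:3, 4:5)
df$TREAT <- list(paste0("Treat-", letters[1:2]), paste0("Treat-", letters[3:5]))

f <- function(l) {
  if (!is.list(l)) return(l)
  do.call(rbind, lapply(l, function(x) `length<-`(x, max(lengths(l)))))
}

# data.frame() expands each matrix argument into one column per matrix column
flat <- do.call(data.frame, lapply(df, f))
names(flat)
ncol(flat)
```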