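I'm trying to create a plot that compares the running times of different algorithms. Running the R code below gives me the following plot, which I'm generally happy with. However, reading values off the plot can be difficult. Is there a way to get the plotted mean for each DBMS on each instance? For example, for gplus-combined the value for CacheDBMS is about 50, while for BranchDBMS it is about 200.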
    # Note: in ggplot2 3.3.0 and later, fun.y is deprecated in favour of fun.
    ggplot(dt, aes(reorder(instance, V9), V9)) +
      geom_point(aes(group=V2, colour=V2), stat='summary', fun.y='mean') +
      geom_line(aes(group=V2, colour=V2), stat='summary', fun.y='mean') +
      scale_y_log10() +
      ylab("Mean wall time") +
      xlab("") +
      ggtitle("Comparison of Database Management Systems") +
      theme_bw() +
      theme(axis.text.x = element_text(angle=45, vjust = 1, hjust = 1)) +
      guides(color=guide_legend(title="DBMS"))
I'd like the y value of each point, preferably as a table, e.g.

    BranchDBMS  gplus-combined  213.21
    CacheDBMS   gplus-combined   48.68
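Edit

A small snippet of the input data (the full set has more than 10,000 rows). I removed the unused columns, so the V* names no longer match the column positions: here V2 is the first column, V9 is the second column, and instance is the last column.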
    BranchDBMS; 0.163352; facebook-combined
    BranchDBMS; 0.169043; facebook-combined
    BranchDBMS; 0.162545; facebook-combined
    BranchDBMS; 0.159489; facebook-combined
    BranchDBMS; 0.168414; facebook-combined
    CacheDBMS ; 0.038515; facebook-combined
    CacheDBMS ; 0.037179; facebook-combined
    CacheDBMS ; 0.037385; facebook-combined
    CacheDBMS ; 0.036514; facebook-combined
    BranchDBMS; 281.149423; gplus-combined
    BranchDBMS; 261.093502; gplus-combined
    BranchDBMS; 258.816546; gplus-combined
    CacheDBMS ; 22.442501; gplus-combined
    CacheDBMS ; 22.377717; gplus-combined
    CacheDBMS ; 22.469739; gplus-combined
    CacheDBMS ; 22.451922; gplus-combined
eipi10.. 5
Here is an example, using the built-in iris data frame, of adding value labels directly to the plot:
    # Note: in ggplot2 3.3.0 and later, fun.y is deprecated in favour of fun.
    p1 = ggplot(iris, aes(Sepal.Width, Sepal.Length, colour=Species)) +
      stat_summary(fun.y=mean, geom="line", alpha=0.5) +
      stat_summary(fun.y=mean, geom="text",
                   aes(label=sprintf("%1.1f", ..y..)),
                   size=3, show.legend=FALSE) +
      guides(colour=guide_legend(override.aes = list(alpha=1, lwd=1)))
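..y.. is the internally calculated mean of the y variable at each value of Sepal.Width for each Species (ggplot2 3.3.0 and later spell this after_stat(y)). Because we used alpha=0.5 for the line geom, override.aes lets us have bolder lines in the legend.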
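If the goal is to read the plotted means back out rather than print them on the plot, the values that stat_summary computes can also be pulled from the plot object. The following is a minimal sketch, not part of the original answer: it assumes ggplot2 3.3.0 or later (for the fun argument), and the names p and plotted are purely illustrative.

```r
library(ggplot2)

# Same summary line as in the example above, written with the newer `fun` argument.
p <- ggplot(iris, aes(Sepal.Width, Sepal.Length, colour = Species)) +
  stat_summary(fun = mean, geom = "line", alpha = 0.5)

# layer_data() builds the plot and returns the data computed for one layer.
# Layer 1 is the stat_summary layer, so `y` holds the plotted means.
plotted <- layer_data(p, i = 1)

# `group` is an integer index into the levels of Species (1 = first level, ...).
plotted$Species <- levels(iris$Species)[plotted$group]

print(head(plotted[, c("Species", "x", "y")]))
```

Here x is the Sepal.Width value and y the mean Sepal.Length that the line passes through, so the result can be used directly as a lookup table for the plot.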
One way to add a table of the data values is as follows:
    library(gridExtra)
    library(dplyr)

    # Change default fontsize for the data table
    mytheme <- ttheme_default(
      core = list(fg_params=list(cex = 0.7)),
      colhead = list(fg_params=list(cex = 0.75)),
      rowhead = list(fg_params=list(cex = 0.75)))

    # Create table (in this case I just show the first three values for each species)
    tab = tableGrob(
      iris %>%
        group_by(Species, Sepal.Width) %>%
        summarise(`Mean Sepal Length`=sprintf("%1.1f", mean(Sepal.Length))) %>%
        slice(1:3),
      theme=mytheme, rows=NULL)

    # Lay out graph and table
    grid.arrange(p1, tab, ncol=1)
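For the data in the question, the same kind of table can be computed straight from the data frame instead of being read off the plot. The sketch below is only an illustration under a few assumptions: it uses base R (read.table and aggregate) rather than dplyr so that it runs without extra packages, it pastes in only the 16 sample rows from the question, and the output column names DBMS and mean_wall_time are made up here. Because it sees only the sample, its numbers will not match the means of the full data set.

```r
# Sample rows copied from the question: V2 = DBMS, V9 = wall time.
raw <- "BranchDBMS; 0.163352; facebook-combined
BranchDBMS; 0.169043; facebook-combined
BranchDBMS; 0.162545; facebook-combined
BranchDBMS; 0.159489; facebook-combined
BranchDBMS; 0.168414; facebook-combined
CacheDBMS ; 0.038515; facebook-combined
CacheDBMS ; 0.037179; facebook-combined
CacheDBMS ; 0.037385; facebook-combined
CacheDBMS ; 0.036514; facebook-combined
BranchDBMS; 281.149423; gplus-combined
BranchDBMS; 261.093502; gplus-combined
BranchDBMS; 258.816546; gplus-combined
CacheDBMS ; 22.442501; gplus-combined
CacheDBMS ; 22.377717; gplus-combined
CacheDBMS ; 22.469739; gplus-combined
CacheDBMS ; 22.451922; gplus-combined"

# strip.white = TRUE removes the padding around the ';' separators.
dt <- read.table(text = raw, sep = ";", strip.white = TRUE,
                 col.names = c("V2", "V9", "instance"),
                 stringsAsFactors = FALSE)

# Mean wall time per DBMS and instance: the quantity that
# stat = 'summary' with a mean function draws in the plot.
means <- aggregate(V9 ~ V2 + instance, data = dt, FUN = mean)
names(means) <- c("DBMS", "instance", "mean_wall_time")

print(means)
```

The resulting data frame has one row per DBMS and instance, so it can be passed to tableGrob in place of the iris summary if the table should sit under the plot.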