
Drawing lines between different elements of stacked bar charts


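I am trying to draw lines between two separate stacked bars (in the same plot) in ggplot2, to show that two segments of the second bar are a subset of the first bar.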

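I have tried both geom_line and geom_segment. However, I run into the same problem with either geom: specifying a separate start and stop for each line (two lines are needed) in the same plot, while working with a data frame that has five rows.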

Example code for the plot without lines:

library(data.table)
library(ggplot2)
Example <- data.table(X_Axis = c('Count', 'Count', 'Dollars', 'Dollars', 'Dollars'),
                  Stack_Group = c('Purely A', 'A & B', 'Purely A Dollars', 'B Mixed Dollars', 'A Mixed dollars'),
                  Value = c(10,3, 120000, 100000, 50000))
Example[, Percent := Value/sum(Value), by = X_Axis]


ggplot(Example, aes(x = X_Axis, y = Percent, fill = factor(Stack_Group))) +
  geom_bar(stat = 'identity', width = 0.5) + 
  scale_y_continuous(labels = scales::percent)

Goal for the final plot: (image not included)



1> Henrik..:

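Instead of hard-coding the start and end positions of the segments, you can grab this data from the plot object. Here is an alternative where you provide the names of the x categories and of the bar elements between which the lines should be drawn.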

Assign the plot to a variable:

p <- ggplot() +
  geom_bar(data = Example,
           aes(x = X_Axis, y = Percent, fill = Stack_Group), stat = 'identity', width = 0.5)

Grab the data from the plot object (ggplot_build) and convert it to a data.table (setDT):

d <- ggplot_build(p)$data[[1]]
setDT(d)

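In the data from the plot object, the 'x' and 'group' variables are not given by name but as numbers. Because ggplot orders categorical variables lexicographically, we can match the numbers to their names by rank within each 'x':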

d[ , r := rank(group), by = x]

Example[ , x := .GRP, by = X_Axis]
Example[ , r := rank(Stack_Group), by = x]
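
The rank matching above relies on a character vector's sorted factor codes agreeing with its ranks. A minimal base R sketch of that idea, using the three "Dollars" labels from the question (the variable names here are illustrative, not part of the answer):

```r
# Labels of the "Dollars" bar, in the order they appear in the data.
stack_group <- c("Purely A Dollars", "B Mixed Dollars", "A Mixed dollars")

# factor() sorts the unique labels to build its levels, so the integer
# codes are the positions of each label in sorted order.
codes <- as.integer(factor(stack_group))

# rank() returns the same positions when there are no ties.
ranks <- rank(stack_group)

print(data.frame(stack_group, codes, ranks))
```

Because both columns agree, a rank computed on the numeric 'group' values inside one bar can be joined to a rank computed on the label names inside the same bar.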

Join to add the 'X_Axis' and 'Stack_Group' names from the original data to the plot data:

d <- d[Example[ , .(X_Axis, Stack_Group, x, r)], on = .(x, r)]

Set the names of the x categories and bar elements between which the lines should be drawn:

x_start_nm <- "Count"
x_end_nm <- "Dollars"

e_start <- "A & B"
e_upper <- "A Mixed dollars"
e_lower <- "B Mixed Dollars"

Select the relevant parts of the plot data to create the start/end data for the lines:

d2 <- data.table(x_start = rep(d[X_Axis == x_start_nm & Stack_Group == e_start, xmax], 2),
                 y_start = d[X_Axis == x_start_nm & Stack_Group == e_start, c(ymax, ymin)],
                 x_end = rep(d[X_Axis == x_end_nm & Stack_Group == e_upper, xmin], 2),
                 y_end = c(d[X_Axis == x_end_nm & Stack_Group == e_upper, ymax],
                           d[X_Axis == x_end_nm & Stack_Group == e_lower, ymin]))

Add the line segments to the original plot:

p + 
  geom_segment(data = d2, aes(x = x_start, xend = x_end, y = y_start, yend = y_end))




2> Uwe..:

Here is another flexible and straightforward approach which is somewhat similar to @Henrik's answer but is working solely with user data. There is no need to extract data from a ggplot_build() object.

Preparing the data

Code:

library(data.table)
library(forcats)

Example <- data.table(
  X_Axis = fct_inorder(c("Count", "Count", "Dollars", "Dollars", "Dollars")),
  Stack_Group = fct_rev(fct_inorder(c("Purely A", "A & B", "Purely A Dollars", 
                                      "B Mixed Dollars", "A Mixed dollars"))),
  Value = c(10, 3, 120000, 100000, 50000),
  Grp2 = fct_inorder(c("Purely", "Mixed", "Purely", "Mixed", "Mixed"))
  )
Example[, Percent := Value/sum(Value), by = X_Axis]
Example[order(Grp2, -Stack_Group), Cumulated := cumsum(Percent), by = X_Axis]

Prepared data:

Example
#    X_Axis      Stack_Group  Value   Grp2   Percent Cumulated
#1:   Count         Purely A     10 Purely 0.7692308 0.7692308
#2:   Count            A & B      3  Mixed 0.2307692 1.0000000
#3: Dollars Purely A Dollars 120000 Purely 0.4444444 0.4444444
#4: Dollars  B Mixed Dollars 100000  Mixed 0.3703704 0.8148148
#5: Dollars  A Mixed dollars  50000  Mixed 0.1851852 1.0000000

Plotting

Code:

library(ggplot2)
w = 0.4   # width of bars
ggplot(Example, aes(x = X_Axis, y = Percent, fill = Stack_Group)) +
  geom_col(width = w) +
  geom_line(aes(x = (1 - w) * as.numeric(X_Axis) + 1.5 * w, y = Top, group = Grp2), 
            data = Example[, .(Top = max(Cumulated)), by = .(X_Axis, Grp2)],
            inherit.aes = FALSE) +
  scale_y_continuous(labels = scales::percent)


Explanation

ggplot implicitly coerces character variables to factors, and the factor levels control the order in which items are plotted. By default, the levels of a factor are in alphabetical order. Here, however, we need to control the plot order explicitly. Therefore, we create factors with a specified order of levels with the help of Hadley's handy forcats package.

The order of levels in Stack_Group is reversed to be in line with the order ggplot2 (version 2.2.0+) is stacking values (see ?position_stack).

The data include two types of groups:

One is along the X_Axis distinguishing between "Count" and "Dollars".

The other one is hidden in Stack_Group, the names of data items, and the way the OP wants to have the line segments drawn. Here, we explicitly define a new variable Grp2 which distinguishes between "Purely" at the bottom of each bar and "Mixed" at the top of each bar. This avoids hard-coding the start and end points of the line segments, which makes this solution more flexible.

The cumulative percentages are computed for each bar. These are needed later for drawing the line segments.

The width of the bar is defined in variable w and passed to the width parameter of geom_col().

Introduced with version 2.2.0 of ggplot2, geom_col() is a shortcut for geom_bar(stat = "identity").

As there are only two bars, geom_line() is used to draw the line segments between them.

On the x-axis, the line segments range from x = 1 + w / 2 to x = 2 - w / 2. Here, we use the fact that ggplot is using the integer codes of the factor levels for plotting. So, "Count" is plotted at x = 1 and "Dollars" at x = 2. (This is why the factor levels were defined explicitly.)
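
As a quick check of the x expression used in geom_line(), the following base R sketch evaluates (1 - w) * x + 1.5 * w at the two bar positions and compares the result with the bar edges. The names x_pos and line_x are only illustrative:

```r
w <- 0.4        # bar width, as in the plotting code
x_pos <- 1:2    # integer codes of "Count" and "Dollars"

# Expression passed to aes(x = ...) in geom_line()
line_x <- (1 - w) * x_pos + 1.5 * w

# Right edge of the first bar and left edge of the second bar
edges <- c(1 + w / 2, 2 - w / 2)

print(line_x)
print(isTRUE(all.equal(line_x, edges)))
```

The linear mapping is specific to two bars at x = 1 and x = 2; with more bars, it would no longer land on adjacent bar edges.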

The y values for each bar are taken from the maximum values Top of the cumulated percentages in each Grp2 which are computed by Example[, .(Top = max(Cumulated)), by = .(X_Axis, Grp2)]. This allows for modifying names and order of data items within each Grp2.
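
To make the Top values concrete, here is a small sketch that recomputes them with base R only. It is a stand-in for the data.table expression above, not the answer's own code. It assumes the rows are already in stacking order within each bar ("Purely" first), and the names df and top are illustrative:

```r
df <- data.frame(
  X_Axis = c("Count", "Count", "Dollars", "Dollars", "Dollars"),
  Grp2   = c("Purely", "Mixed", "Purely", "Mixed", "Mixed"),
  Value  = c(10, 3, 120000, 100000, 50000)
)

# Share of each item within its bar, then the running total per bar.
df$Percent   <- df$Value / ave(df$Value, df$X_Axis, FUN = sum)
df$Cumulated <- ave(df$Percent, df$X_Axis, FUN = cumsum)

# Highest cumulated share per bar and group: the y position of each line end.
top <- tapply(df$Cumulated, list(df$X_Axis, df$Grp2), max)
print(top)
```

The "Purely" column gives the end points of the lower line (10/13 for "Count", 4/9 for "Dollars"), and the "Mixed" column gives the end points of the upper line at 100%.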

The parameter inherit.aes = FALSE is required to prevent ggplot from expecting a value for the fill aesthetic.

Enhancement

If required, Grp2 could be visualised easily using a different line type:

w = 0.2   # width of bars
ggplot(Example, aes(x = X_Axis, y = Percent, fill = Stack_Group)) +
  geom_col(width = w) +
  geom_line(aes(x = (1 - w) * as.numeric(X_Axis) + 1.5 * w, y = Top, 
                group = Grp2, linetype = fct_rev(Grp2)), 
            data = Example[, .(Top = max(Cumulated)), by = .(X_Axis, Grp2)],
            inherit.aes = FALSE) +
  scale_y_continuous(labels = scales::percent) + 
  labs(linetype = "Purely vs Mixed")

Now, the levels of Grp2 are displayed in the legend. The legend title has been renamed using labs(). The order of levels in Grp2 has been reversed to have the solid line at 100% and to show the levels in the legend as they are stacked in the chart ("Purely" at the bottom, "Mixed" above).

Note that also the width parameter w was changed for demonstration purposes.
