I have time series data with frequency = 7, as follows:
    # Fields are comma-separated here because "San Francisco" contains a space;
    # strip.white = TRUE removes any stray whitespace around the fields.
    combo_1_daily_mini <- read.table(header = TRUE, sep = ",", strip.white = TRUE, text = "
    region_1,region_2,region_3,date,incidents
    USA,CA,San Francisco,1/1/15,37
    USA,CA,San Francisco,1/2/15,30
    USA,CA,San Francisco,1/3/15,31
    USA,CA,San Francisco,1/4/15,33
    USA,CA,San Francisco,1/5/15,28
    USA,CA,San Francisco,1/6/15,33
    USA,CA,San Francisco,1/7/15,39
    USA,PA,Pittsburg,1/1/15,38
    USA,PA,Pittsburg,1/2/15,35
    USA,PA,Pittsburg,1/3/15,37
    USA,PA,Pittsburg,1/4/15,33
    USA,PA,Pittsburg,1/5/15,30
    USA,PA,Pittsburg,1/6/15,33
    USA,PA,Pittsburg,1/7/15,25
    Greece,Macedonia,Skopje,1/1/15,29
    Greece,Macedonia,Skopje,1/2/15,37
    Greece,Macedonia,Skopje,1/3/15,28
    Greece,Macedonia,Skopje,1/4/15,38
    Greece,Macedonia,Skopje,1/5/15,27
    Greece,Macedonia,Skopje,1/6/15,38
    Greece,Macedonia,Skopje,1/7/15,39
    Italy,Trentino,Trento,1/1/15,35
    Italy,Trentino,Trento,1/2/15,31
    Italy,Trentino,Trento,1/3/15,34
    Italy,Trentino,Trento,1/4/15,34
    Italy,Trentino,Trento,1/5/15,26
    Italy,Trentino,Trento,1/6/15,33
    Italy,Trentino,Trento,1/7/15,27
    ")

    # dput() output of the same data:
    dput(trst, control = "all")
    structure(list(
      region_1 = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
                             1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
                           .Label = c("Greece", "Italy", "USA"), class = "factor"),
      region_2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
                             2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L),
                           .Label = c("CA", "Macedonia", "PA", "Trentino"), class = "factor"),
      region_3 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
                             3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L),
                           .Label = c("Pittsburg", "San Francisco", "Skopje", "Trento"),
                           class = "factor"),
      date = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
                         1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L),
                       .Label = c("1/1/15", "1/2/15", "1/3/15", "1/4/15", "1/5/15",
                                  "1/6/15", "1/7/15"), class = "factor"),
      incidents = c(37L, 30L, 31L, 33L, 28L, 33L, 39L, 38L, 35L, 37L, 33L, 30L, 33L, 25L,
                    29L, 37L, 28L, 38L, 27L, 38L, 39L, 35L, 31L, 34L, 34L, 26L, 33L, 27L)),
      .Names = c("region_1", "region_2", "region_3", "date", "incidents"),
      class = "data.frame", row.names = c(NA, -28L))
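Each region_1, region_2, region_3 group has its own seasonality and trend.
I am trying to forecast the number of incidents for the next week from historical data. I have six months of history, from 1 January 2015 to 30 June 2015, for 32 different countries. Each country has many region_2 and region_3 values. In total I have 32,356 unique region_1, region_2, region_3 time series.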
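To make the layout described above concrete, here is a minimal sketch of splitting a long data frame into one daily series per region combination, each with a weekly cycle. The data frame `toy` and its values are made up for illustration; only the column names are taken from the sample above.

```r
# Hypothetical toy data: 2 region combinations, 14 daily observations each.
toy <- data.frame(
  region_1  = rep(c("USA", "Italy"), each = 14),
  region_2  = rep(c("CA", "Trentino"), each = 14),
  region_3  = rep(c("San Francisco", "Trento"), each = 14),
  date      = rep(seq(as.Date("2015-01-01"), by = "day", length.out = 14), times = 2),
  incidents = c(30 + (1:14 %% 7), 25 + (1:14 %% 7)),
  stringsAsFactors = FALSE
)

# Sort by date within each group so every series is in time order.
toy <- toy[order(toy$region_1, toy$region_2, toy$region_3, toy$date), ]

# One numeric vector per observed combination; drop = TRUE skips
# combinations that never occur together (e.g. USA / Trentino / Trento).
groups <- split(toy$incidents,
                list(toy$region_1, toy$region_2, toy$region_3),
                drop = TRUE)

# Wrap each vector in a ts object with frequency = 7 (daily data, weekly cycle).
series <- lapply(groups, ts, frequency = 7)

length(series)  # number of distinct region combinations
```

On the real data, `length(series)` would be expected to match the 32,356 combinations mentioned above, assuming every combination has at least one row.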
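I have 2 issues/questions:
Issue - When I apply Holt-Winters inside the by() function, I get warnings that I cannot understand. Any help understanding them would be much appreciated.
Below is my code: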
    ts_fun <- function(x){
      ts_y <- ts(x, frequency = 7)
    }

    hw_fun <- function(x){
      ts_y <- ts_fun(x)
      ts_h <- HoltWinters(ts_y)
    }

    combo_1_daily_mini$region_1 <- as.factor(combo_1_daily_mini$region_1)
    combo_1_daily_mini$region_2 <- as.factor(combo_1_daily_mini$region_2)
    combo_1_daily_mini$region_3 <- as.factor(combo_1_daily_mini$region_3)

    combo_1_ts <- by(combo_1_daily_mini,
                     list(combo_1_daily_mini$region_1,
                          combo_1_daily_mini$region_2,
                          combo_1_daily_mini$region_3),
                     ts_fun)

    combo_1_hw <- by(combo_1_daily_mini,
                     list(combo_1_daily_mini$region_1,
                          combo_1_daily_mini$region_2,
                          combo_1_daily_mini$region_3),
                     hw_fun)
Warning messages:
    1: In HoltWinters(ts_y) :
      optimization difficulties: ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
    2: In HoltWinters(ts_y) :
      optimization difficulties: ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
    3: In HoltWinters(ts_y) :
      optimization difficulties: ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
    4: In HoltWinters(ts_y) :
      optimization difficulties: ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
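Question - Is the way I am applying the function by multiple columns correct? Is there a better way? I basically want next week's forecast numbers by region_1, region_2, region_3. I plan to use the following code: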
    nw_forecast <- forecast(combo_1_hw, 7)
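I am able to apply the Holt-Winters function, and also forecast, if I create the time series data for each region_1, region_2, region_3 combination one at a time. That approach is not feasible, because there are 32,356 unique combinations in my dataset.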
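For illustration, the kind of per-combination output I am after could be sketched as below. This is only a rough sketch under assumptions, not a confirmed solution: the data frame `toy` is simulated, the helper `forecast_week` is a hypothetical name, and it uses `predict()` from base R's stats package rather than `forecast()` so that it runs without extra packages. It passes only the `incidents` column to `HoltWinters()` and assumes at least two full weeks of data per combination.

```r
# Hypothetical data: 3 region combinations, 8 weeks of daily counts each,
# built from a weekly pattern, a mild upward trend and random noise.
set.seed(1)
n_days <- 56
weekly <- c(0, 2, 4, 6, 4, 2, -3)
toy <- data.frame(
  region_1  = rep(c("USA", "USA", "Italy"), each = n_days),
  region_2  = rep(c("CA", "PA", "Trentino"), each = n_days),
  region_3  = rep(c("San Francisco", "Pittsburg", "Trento"), each = n_days),
  date      = rep(seq(as.Date("2015-01-01"), by = "day", length.out = n_days), times = 3),
  incidents = 30 + 0.1 * rep(seq_len(n_days), 3) +
              rep(weekly, 3 * n_days / 7) + rnorm(3 * n_days),
  stringsAsFactors = FALSE
)

# Fit Holt-Winters to one numeric vector and return an h-step-ahead forecast.
# If the fit stops with an error, return NAs so one bad series does not
# abort the whole run (warnings are still reported as usual).
forecast_week <- function(y, h = 7) {
  tryCatch({
    fit <- HoltWinters(ts(y, frequency = 7))
    as.numeric(predict(fit, n.ahead = h))
  }, error = function(e) rep(NA_real_, h))
}

# Split only the incidents column by region combination. The rows of toy are
# already in date order within each combination, which the fit relies on.
groups <- split(toy$incidents,
                list(toy$region_1, toy$region_2, toy$region_3),
                drop = TRUE)

# One row per combination, one column per forecast day.
forecasts <- t(sapply(groups, forecast_week))
colnames(forecasts) <- paste0("day_", 1:7)
forecasts
```

The result is a matrix with one row per region combination and seven forecast columns, which is the shape I would like to get for all 32,356 series.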
Any help is appreciated. Thank you.