Welcome to Vigges Developer Community: Open, Learning, Share
Welcome To Ask or Share your Answers For Others

r - plotting 2 regression lines on time series data before and after a time period

I have some time series data which looks like:

   date                totalDaily after_15Oct
   <dttm>                   <dbl>       <dbl>
 1 2020-09-01 00:00:00  14298554.           0
 2 2020-09-02 00:00:00  14427924.           0
 3 2020-09-03 00:00:00  14594507.           0
 4 2020-09-04 00:00:00  14792610.           0

Where after_15Oct takes on a value of 0 or 1.
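
The indicator can be derived from the date column instead of being typed in, which makes it easy to move the break to another day. A minimal sketch, assuming the cutoff is midnight UTC on 15 October 2020 and that the cutoff day itself counts as "after"; `cutoff` and `dates` are illustrative names, not part of the original data:

```r
# Illustrative cutoff; change this one value to move the break point.
cutoff <- as.POSIXct("2020-10-15", tz = "UTC")

# A few example dates around the cutoff (not the full data set).
dates <- as.POSIXct(c("2020-10-13", "2020-10-14", "2020-10-15", "2020-10-16"),
                    tz = "UTC")

# 1 on or after the cutoff, 0 before it.
after_15Oct <- as.numeric(dates >= cutoff)
after_15Oct
```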

When I plot the data it looks like:

[Image: plot of the data]

I want to fit a regression and plot the fitted values so that the plot looks similar to:

[Image: example plot with a fitted regression line before and after the time period]

I can run the regression using:

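library(dplyr)
library(ggplot2)

# Fit one linear model per value of after_15Oct (0 = before, 1 = after).
mods <- df %>% 
  group_by(after_15Oct) %>% 
  summarise(mod = list(lm(totalDaily ~ date)))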

Which gives me 2 regressions, one before 15th Oct and one after.
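
If only the fitted values are needed, they can be kept on the rows they belong to by fitting inside a grouped `mutate()`, so the result always has one value per row of the input. A sketch on made-up data, assuming dplyr is available; `toy` and `toy_fitted` are hypothetical names, and the toy values are chosen to be exactly linear within each group:

```r
library(dplyr)

# Toy data standing in for df: two groups with different linear trends.
toy <- tibble(
  date        = as.POSIXct("2020-09-01", tz = "UTC") + 86400 * 0:9,
  totalDaily  = c(10, 12, 14, 16, 18, 30, 29, 28, 27, 26),
  after_15Oct = rep(c(0, 1), each = 5)
)

# Fit one lm per group and store each fitted value on its own row,
# so no manual NA padding is needed.
toy_fitted <- toy %>%
  group_by(after_15Oct) %>%
  mutate(fitted = as.numeric(fitted(lm(totalDaily ~ date)))) %>%
  ungroup()

toy_fitted
```

Because the fitted column is built group by group, the same code works unchanged when the number of rows before or after the break changes.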

I can then filter the data and make predictions for the period after 15th Oct.

after15OctData <- df %>% 
  filter(after_15Oct == 1)

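# Predict from the pre-15-Oct model over the post-15-Oct dates.
# predict() takes `newdata`; a `data` argument is silently ignored.
pred <- predict(mods$mod[[1]], type = "response", newdata = after15OctData)

# Pad with NA so each vector lines up with the 65 rows of df.
pred <- c(rep(NA, 32), pred)
fittedValues <- c(mods$mod[[1]]$fitted.values, rep(NA, 33))
fittedValues2 <- c(rep(NA, 32), mods$mod[[2]]$fitted.values)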

df %>% 
  cbind(pred) %>% 
  ggplot(aes(x = date)) +
  geom_point(aes(y = totalDaily)) +
  geom_line(aes(y = pred), color = "black") +
  geom_line(aes(y = fittedValues), color = "orange") +
  geom_line(aes(y = fittedValues2), color = "blue")

Which gives me:

[Image: resulting plot]

However, the lines do not join up. I have two questions:

How can I make the two fitted lines join up, so that the plot looks similar to the example picture?

Is there a more "tidy" way of obtaining the result? The hard-coded counts passed to rep() do not generalise to other time periods, because I have to change the numbers by hand each time.
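
For the plotting part, one option that avoids padding altogether is to let ggplot2 fit the lines itself, with the indicator mapped to the `group` aesthetic. A sketch on made-up data, assuming ggplot2 is available; `toy` and `p` are hypothetical names and the colours simply mirror the ones used above:

```r
library(ggplot2)

# Toy data standing in for df (same column names as in the question).
toy <- data.frame(
  date        = as.POSIXct("2020-09-01", tz = "UTC") + 86400 * 0:9,
  totalDaily  = c(10, 13, 13, 17, 18, 30, 28, 29, 26, 26),
  after_15Oct = rep(c(0, 1), each = 5)
)

# Mapping the indicator to group/colour makes geom_smooth() fit and draw
# one straight line per period; nothing depends on the number of rows.
p <- ggplot(toy, aes(x = date, y = totalDaily)) +
  geom_point() +
  geom_smooth(aes(group = after_15Oct, colour = factor(after_15Oct)),
              method = "lm", formula = y ~ x, se = FALSE) +
  scale_colour_manual(values = c("0" = "orange", "1" = "blue"),
                      name = "after_15Oct")

p
```

Each line is drawn only over its own period and the two fits are independent, so the segments will generally not meet at the break.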

Data:

df <- structure(list(date = structure(c(1598918400, 1599004800, 1599091200, 
1599177600, 1599436800, 1599523200, 1599609600, 1599696000, 1599782400, 
1600041600, 1600128000, 1600214400, 1600300800, 1600387200, 1600646400, 
1600732800, 1600819200, 1600905600, 1600992000, 1601251200, 1601337600, 
1601424000, 1601510400, 1601596800, 1601856000, 1601942400, 1602028800, 
1602115200, 1602201600, 1602460800, 1602547200, 1602633600, 1602720000, 
1602806400, 1603065600, 1603152000, 1603238400, 1603324800, 1603411200, 
1603670400, 1603756800, 1603843200, 1603929600, 1604016000, 1604275200, 
1604361600, 1604448000, 1604534400, 1604620800, 1604880000, 1604966400, 
1605052800, 1605139200, 1605225600, 1605484800, 1605571200, 1605657600, 
1605744000, 1605830400, 1606089600, 1606176000, 1606262400, 1606348800, 
1606435200, 1606694400), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), totalDaily = c(14298554.063, 14427923.715, 14594506.597, 
14792610.38, 14176724.039, 14597634.909, 14398503.951, 14720623.375, 
10741599.683, 15321082.804, 15327286.026, 15285290.512, 15385702.052, 
15472501.992, 14824282.66, 15373475.777, 15432865.492, 13987846.969, 
15030238.471, 14982987.831, 15068473.939, 15420153.347, 15467401.077, 
15222054.193, 15115719.969, 15068013.058, 15036988.418, 15203030.521, 
15388979.495, 10272544.304, 14813377.192, 14744991.315, 14975803.7991768, 
14536858.289, 13863147.481, 14086232.611, 14174599.012, 14071285.48, 
14403303.75, 13654238.548, 13929438.016, 13953753.186, 14217652.491, 
12738872.418, 13315829.399, 13620749.604, 13665241.814, 13965383.623, 
12835957.62, 13550407.198, 13465093.039, 13697991.699, 13862048.141, 
13193641.13, 13863001.871, 14104868.109, 14077306.804, 14378736.115, 
13431275.993, 14499347.856, 14741269.274, 14573735.082, 14401326.933, 
13689721.344, 14362745.589), after_15Oct = c(0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), row.names = c(NA, 
-65L), class = c("tbl_df", "tbl", "data.frame"))


1 Answer

Waiting for an expert to reply.
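
One possible way to make the two lines meet, offered as a sketch rather than a definitive answer: two separate regressions have no reason to agree at the break, but a single model with a "hinge" term (the time elapsed since the cutoff, truncated at zero) is continuous there by construction. This assumes that "joining up" means one line that changes slope at the cutoff. The data, `cutoff`, `day` and `days_after` are all made up for illustration, and ggplot2 is assumed to be available:

```r
library(ggplot2)

# Toy data standing in for df; `cutoff` is an illustrative break point.
cutoff <- as.POSIXct("2020-09-06", tz = "UTC")
toy <- data.frame(
  date       = as.POSIXct("2020-09-01", tz = "UTC") + 86400 * 0:9,
  totalDaily = c(10, 12, 14, 16, 18, 20, 19, 18, 17, 16)
)

# Time in days since the first observation.
toy$day <- as.numeric(difftime(toy$date, min(toy$date), units = "days"))
# Days since the cutoff, truncated at zero: 0 before the break, growing after.
toy$days_after <- pmax(as.numeric(difftime(toy$date, cutoff, units = "days")), 0)

# One model: intercept, slope before the break (day), and a change in
# slope after it (days_after). The fitted line is continuous at the cutoff.
fit <- lm(totalDaily ~ day + days_after, data = toy)
toy$fitted <- as.numeric(fitted(fit))
coef(fit)

# Points plus a single fitted line that bends at the cutoff.
p <- ggplot(toy, aes(x = date)) +
  geom_point(aes(y = totalDaily)) +
  geom_line(aes(y = fitted), colour = "blue")
p
```

Only `cutoff` needs changing to move the break. If a jump in level at the cutoff is also wanted, the 0/1 indicator could be added as a further term, but the fitted line is then no longer forced to be continuous.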
