基本思想是“二项分布的成功概率”*“泊松分布的lambda(=期望值)”。但是你必须考虑到计数部分的泊松模型永远不会返回 0。
我假设您的示例数据并预测何时width是c(23, 26, 29)。
library(pscl)
data <- data.frame(y = c(8, 0, 3, 7), width = c(34.40, 22.50, 28.34, 32.22))
model <- hurdle(y ~ width, data = data)
model
# Count model coefficients (truncated poisson with log link):
# (Intercept) width
# -3.4520 0.1629 # model$coef$count[[1]] (left); model$coef$count[[2]] (right)
# Zero hurdle model coefficients (binomial with logit link):
# (Intercept) width
# -198.851 7.809 # model$coef$zero[[1]] (left); model$coef$zero[[2]] (right)
首先,您计算二项式模型的成功概率phi_zero, (逻辑方程)。(由于使用它,我附上了一个记录版本predict()。)
phi_zero <- 1 / ( 1 + exp(-(model$coef$zero[[1]] + model$coef$zero[[2]] * c(23, 26, 29))))
# p0_zero <- log(phi_zero)
其次,您计算泊松模型的参数(= 期望值)mu,以及 > 0 的概率,phi_count。
mu <- exp(model$coef$count[[1]] + model$coef$count[[2]] * c(23, 26, 29))
phi_count <- ppois(0, lambda = mu, lower.tail = F) # not 0 probability
# p0_count <- ppois(0, lambda = mu, lower.tail = F, log.p = T)
最后,整合两个模型的值。
phi <- phi_zero / phi_count # because there isn't 0 coming from poisson.
rval <- phi * mu
# logphi <- p0_zero - p0_count
# rval2 <- exp(logphi + log(mu))
rval
# [1] 8.051324e-09 2.429582e+00 3.674317e+00
并将predict()类hurdle作为参数(参见 参考资料?predict.hurdle)。
predict(model, data.frame(width = c(23, 26, 29)), type = "response")
# 1 2 3
# 8.051324e-09 2.429582e+00 3.674317e+00 # the same