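I have some data on airline flights (in a data frame called flights), and I would like to see whether flight time has any effect on the probability of a significantly delayed arrival (meaning 10 minutes or more). I figured I'd use logistic regression, with flight time as the predictor and whether or not each flight was significantly delayed (a bunch of Bernoullis) as the response. I used the following code...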
flights$BigDelay <- flights$ArrDelay >= 10
delay.model <- glm(BigDelay ~ ArrDelay, data=flights, family=binomial(link="logit"))
summary(delay.model)
...but got the following output.
> flights$BigDelay <- flights$ArrDelay >= 10
> delay.model <- glm(BigDelay ~ ArrDelay, data=flights, family=binomial(link="logit"))
Warning messages:
1: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,  :
  algorithm did not converge
2: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,  :
  fitted probabilities numerically 0 or 1 occurred
> summary(delay.model)
Call:
glm(formula = BigDelay ~ ArrDelay, family = binomial(link = "logit"),
    data = flights)
Deviance Residuals:
       Min          1Q      Median          3Q         Max
-3.843e-04  -2.107e-08  -2.107e-08   2.107e-08   3.814e-04
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -312.14     170.26  -1.833   0.0668 .
ArrDelay       32.86      17.92   1.833   0.0668 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 2.8375e+06  on 2291292  degrees of freedom
Residual deviance: 9.1675e-03  on 2291291  degrees of freedom
AIC: 4.0092
Number of Fisher Scoring iterations: 25
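What does it mean that the algorithm did not converge? I thought it was because the BigDelay values were TRUE and FALSE instead of 0 and 1, but I got the same error after converting everything. Any ideas?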