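I have some network data measuring the noise level in a cellular network. A typical mast usually has three sectors, or antennas, pointing in different directions. Within one of these antennas there can be multiple frequencies, all serving roughly the same geographic area.
I have two weeks of 15-minute data, i.e. 1343 observations, for 12 cells/sectors in the network.
As you can see from the summary, each variable has only a few missing values: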
> str(wideRawDF)
'data.frame': 1343 obs. of 13 variables:
$ Period.Start.Time: POSIXlt, format: "2017-01-20 16:30:00" "2017-01-20 16:45:00" "2017-01-20 17:00:00" "2017-01-20 17:15:00" ...
$ DO0182U09A3 : num -102 -101 -101 -101 -101 ...
$ DO0182U09B3 : num -103.4 -102.8 -103.3 -95.9 -103 ...
$ DO0182U09C3 : num -103.9 -104.2 -103.9 -99.2 -104.1 ...
$ DO0182U21A1 : num -105 -105 -105 -104 -102 ...
$ DO0182U21A2 : num -105 -104 -105 -105 -105 ...
$ DO0182U21A3 : num -105 -105 -105 -105 -105 ...
$ DO0182U21B1 : num -102 -103 -104 -104 -104 ...
$ DO0182U21B2 : num -99.4 -102 -104 -101.4 -104.1 ...
$ DO0182U21B3 : num -104 -104 -104 -104 -104 ...
$ DO0182U21C1 : num -105 -105 -105 -104 -105 ...
$ DO0182U21C2 : num -104 -105 -105 -103 -105 ...
$ DO0182U21C3 : num -105 -105 -105 -105 -105 ...
> summary(wideRawDF)
Period.Start.Time DO0182U09A3 DO0182U09B3 DO0182U09C3 DO0182U21A1 DO0182U21A2
Min. :2017-01-20 16:30:00 Min. :-104.23 Min. :-105.90 Min. :-106.43 Min. :-106.16 Min. :-105.94
1st Qu.:2017-01-24 04:22:30 1st Qu.:-102.20 1st Qu.:-104.53 1st Qu.:-105.18 1st Qu.:-105.41 1st Qu.:-105.37
Median :2017-01-27 16:15:00 Median :-101.32 Median :-103.14 Median :-103.74 Median :-105.20 Median :-105.15
Mean :2017-01-27 16:15:00 Mean : -99.75 Mean :-102.21 Mean :-103.12 Mean :-105.00 Mean :-104.85
3rd Qu.:2017-01-31 04:07:30 3rd Qu.: -99.42 3rd Qu.:-101.21 3rd Qu.:-102.73 3rd Qu.:-104.89 3rd Qu.:-104.78
Max. :2017-02-03 16:00:00 Max. : -85.96 Max. : -69.96 Max. : -83.16 Max. : -88.01 Max. : -91.49
NA's :7 NA's :10 NA's :10 NA's :10 NA's :10
DO0182U21A3 DO0182U21B1 DO0182U21B2 DO0182U21B3 DO0182U21C1 DO0182U21C2 DO0182U21C3
Min. :-106.42 Min. :-105.40 Min. :-105.40 Min. :-105.45 Min. :-106.08 Min. :-106.45 Min. :-106.47
1st Qu.:-105.48 1st Qu.:-104.48 1st Qu.:-104.41 1st Qu.:-104.46 1st Qu.:-105.42 1st Qu.:-105.45 1st Qu.:-105.48
Median :-105.32 Median :-103.92 Median :-103.90 Median :-103.77 Median :-105.14 Median :-105.18 Median :-105.27
Mean :-105.06 Mean :-103.19 Mean :-103.09 Mean :-102.87 Mean :-104.96 Mean :-104.97 Mean :-105.08
3rd Qu.:-105.08 3rd Qu.:-102.73 3rd Qu.:-102.50 3rd Qu.:-101.53 3rd Qu.:-104.80 3rd Qu.:-104.87 3rd Qu.:-104.92
Max. : -89.24 Max. : -86.43 Max. : -81.07 Max. : -85.27 Max. : -93.65 Max. : -87.37 Max. : -86.89
NA's :10 NA's :3 NA's :3 NA's :3
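As part of my analysis of this data set I have become bogged down in the complexities of data imputation. I have read a number of Stack Overflow and Cross Validated posts as well as some papers, but every time I come across a new paper I end up going off on another tangent.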
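My data are not normally distributed; in fact they are right-skewed, so I cannot use the mtsdi EM algorithm, because it requires normality. imputeTS is for univariate time series, so that is no use to me either.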
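I am currently trying to work out a problem with the TestMCARNormality function in the MissMech package. I am hoping to confirm that my missingness is MCAR and, because of the non-normality, that I can use a non-parametric method for imputation.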
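Separately from the formal test, a simple way to look at the missingness pattern is to count the NAs per series, measure how long each gap is, and check whether the gaps fall on the same timestamps across series. The following is only a minimal sketch in base R on a made-up data frame (`toy` and `na_run_lengths` are hypothetical names, not part of my real code); with the real data I would pass `wideRawDF[-1]` to drop the time column.

```r
# Toy stand-in for wideRawDF[-1]: made-up values, not my real measurements.
toy <- data.frame(
  A = c(-102, NA, NA, -101, -100, -103),
  B = c(-103, NA, NA, -102, NA, -104),
  C = c(-105, -104, -105, -104, -105, -104)
)

# Number of missing values in each series.
na_counts <- colSums(is.na(toy))

# Lengths of the consecutive runs of NA in one series (hypothetical helper).
na_run_lengths <- function(x) {
  r <- rle(is.na(x))
  r$lengths[r$values]
}
gap_lengths <- lapply(toy, na_run_lengths)

# Number of series that are missing at each timestamp (row).
missing_per_row <- rowSums(is.na(toy))

print(na_counts)
print(gap_lengths)
print(missing_per_row)
```

If most gaps are one or two intervals long and occur at the same timestamps in several series, that says something different about the mechanism than isolated gaps scattered independently through each series.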
What reasons would prevent me from using linear, spline, or Stineman interpolation to fill in these missing values?
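To make the question concrete, here is a minimal sketch of what I mean by linear and spline interpolation, using only `approx()` and `spline()` from base R on a short made-up series (the values and the names `minutes`, `y`, `y_lin`, `y_spl` are assumptions for illustration). Stineman interpolation is not in base R; I believe it is available through add-on packages such as stinepack, so it is left out here.

```r
# Hypothetical 15-minute series with a two-interval gap (values made up).
minutes <- seq(0, by = 15, length.out = 8)   # minutes since the first sample
y <- c(-102, -101, NA, NA, -101, -95.9, -103, -102)
ok <- !is.na(y)

# Linear interpolation: fill each gap along the straight line
# between the nearest observed neighbours.
y_lin <- y
y_lin[!ok] <- approx(minutes[ok], y[ok], xout = minutes[!ok])$y

# Cubic spline interpolation through all observed points.
y_spl <- y
y_spl[!ok] <- spline(minutes[ok], y[ok], xout = minutes[!ok],
                     method = "fmm")$y

# Compare the two fills side by side.
print(data.frame(minutes, y, y_lin, y_spl))
```

Both methods treat each series on its own and use only the neighbouring observations in time, so they ignore the information in the other cells that serve the same area.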

