Signal processing - Recovering phase from a spectrogram

This answer gives an iterative algorithm for recovering a signal from its spectrogram, assuming overlapping windows:

$x_{n+1} = \operatorname{istft}(S \cdot \exp(i \cdot \operatorname{angle}(\operatorname{stft}(x_n)))),$

where $S$ is the spectrogram and $\operatorname{(i)stft}$ is the (inverse) short-time Fourier transform. I tried it using NumPy, SciPy, and librosa:

import librosa
import numpy as np
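import scipy.io.wavfile  # import the submodule explicitly; a bare `import scipy` does not guarantee it is loaded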

def mse(phase_pred, phase_true):
    '''
    Calculate the mean square error between the true phase and the
    predicted (reconstructed) phase.
    '''
    return np.mean(np.angle(phase_pred/phase_true)**2)

# Load an audio file and calculate STFT.
x, sample_rate = librosa.load('audio.wav', sr=44100)
D = librosa.stft(x)
mag, actual_phase = librosa.magphase(D)

# Try to reconstruct the phase using the iterative algorithm above.
phase = np.exp(1.j * np.random.uniform(0., 2*np.pi, size=actual_phase.shape))
x_ = librosa.istft(mag * phase)
print('iter {} mse {}'.format(-1, mse(phase, actual_phase)))
for i in range(100+1):
    _, phase = librosa.magphase(librosa.stft(x_))
    x_ = librosa.istft(mag * phase)
    print('iter {} mse {}'.format(i, mse(phase, actual_phase)))
    if i % 10 == 0:
        scipy.io.wavfile.write('recons{:05d}.wav'.format(i), 44100, x_)

As far as I can tell, it does not converge:

However, the audio after the 100th iteration definitely sounds closer to the original than the audio with random phase.
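One way to put a number on that listening impression might be to track how well the magnitude of the re-analyzed signal matches the target spectrogram, instead of comparing against the true phase. Note that $x$ and $-x$ have the same spectrogram, so the magnitude alone cannot pin the phase down uniquely. The sketch below is only an illustration under assumptions: `stft` and `istft` are hypothetical NumPy stand-ins for `librosa.stft` and `librosa.istft`, the frame size, hop size, and two-tone test signal are made up, and `spectral_convergence` is a name chosen here for the relative magnitude mismatch.

```python
import numpy as np

N_FFT, HOP = 256, 64          # assumed frame and hop sizes (75% overlap)
WINDOW = np.hanning(N_FFT)    # assumed analysis/synthesis window

def stft(x):
    """Windowed frames -> one-sided FFT, shape (n_frames, N_FFT // 2 + 1)."""
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP:i * HOP + N_FFT] * WINDOW
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(Z):
    """Weighted overlap-add inverse of stft() (least-squares style)."""
    frames = np.fft.irfft(Z, n=N_FFT, axis=1)
    length = N_FFT + HOP * (len(frames) - 1)
    x, norm = np.zeros(length), np.zeros(length)
    for i, frame in enumerate(frames):
        x[i * HOP:i * HOP + N_FFT] += frame * WINDOW
        norm[i * HOP:i * HOP + N_FFT] += WINDOW ** 2
    # Guard against division by zero where the window sum vanishes.
    return x / np.maximum(norm, 1e-8)

def spectral_convergence(mag, x):
    """Relative mismatch between the target magnitude and |stft(x)|."""
    return np.linalg.norm(mag - np.abs(stft(x))) / np.linalg.norm(mag)

def run_demo(n_iter=50, seed=0):
    """Run the iteration on a synthetic signal and record the mismatch."""
    rng = np.random.default_rng(seed)
    t = np.arange(4096) / 8000.0
    x_true = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1230 * t)
    mag = np.abs(stft(x_true))

    # Same scheme as in the question: start from uniformly random phase.
    phase = np.exp(1j * rng.uniform(0.0, 2 * np.pi, size=mag.shape))
    x_ = istft(mag * phase)
    errors = [spectral_convergence(mag, x_)]
    for _ in range(n_iter):
        phase = np.exp(1j * np.angle(stft(x_)))
        x_ = istft(mag * phase)
        errors.append(spectral_convergence(mag, x_))
    return errors

if __name__ == "__main__":
    errors = run_demo()
    print("first {:.4f} last {:.4f}".format(errors[0], errors[-1]))
```

If this mismatch shrinks over the iterations while the phase MSE stays flat, that would at least be consistent with the audio sounding closer to the original. It does not by itself show that the estimated phase approaches the true one.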

Why doesn't this algorithm converge to the correct phase? Have I misunderstood its purpose?