Blind Audio Restoration using Contrastive Diffusion Guidance
Abstract
We consider the inverse problem of restoring recorded audio signals that are corrupted by a combination of unknown degradations. Direct inversion is ill-posed because many audio samples can explain the same degraded measurements. To overcome this, we adopt a diffusion-based posterior sampler to generate audio that is consistent with the degraded recordings. Although generative inverse solvers are an active area of research, we find that the fully blind nature of our problem poses new challenges in deriving a tractable likelihood score. We break away from existing approaches, which either estimate or partially approximate the forward operator, and instead reformulate the likelihood score in an embedding space learned via contrastive training. By noting that a surrogate form of the likelihood score in this embedding space is a valid approximation of the true likelihood score, we show that it is possible to steer the denoising process towards the posterior. We perform experiments on historical piano recordings and show that our model, AudioCoGuide, offers the promise of solving blind audio inverse problems via contrastive guidance.
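The idea of steering a denoising process with a likelihood score computed in an embedding space can be illustrated with a toy sketch. This is not the authors' implementation: the linear `embed` function stands in for a contrastively trained encoder, the squared-distance surrogate and the names `surrogate_score` and `guided_step` are assumptions made for illustration, and a real sampler would interleave this update with the diffusion model's denoising steps.

```python
import numpy as np

def embed(x, W):
    """Hypothetical encoder: a fixed linear map standing in for a
    contrastively trained network (assumption, not from the source)."""
    return W @ x

def surrogate_log_likelihood(x0_hat, y, W, sigma=1.0):
    """Assumed surrogate for log p(y | x): negative squared distance
    between the embeddings of the estimate and of the degraded recording."""
    r = embed(x0_hat, W) - embed(y, W)
    return -0.5 * float(r @ r) / sigma**2

def surrogate_score(x0_hat, y, W, sigma=1.0):
    """Gradient of the surrogate log-likelihood with respect to x0_hat.
    It is analytic here only because the toy encoder is linear."""
    r = embed(x0_hat, W) - embed(y, W)
    return -(W.T @ r) / sigma**2

def guided_step(x0_hat, y, W, step=0.5, sigma=1.0):
    """Move the current estimate along the surrogate score, i.e. towards
    agreement with the measurement in embedding space."""
    return x0_hat + step * surrogate_score(x0_hat, y, W, sigma)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(8, 16))
    W /= np.linalg.norm(W, 2)      # unit spectral norm keeps the step stable
    y = rng.normal(size=16)        # stand-in for a degraded recording
    x = rng.normal(size=16)        # stand-in for the current denoised estimate
    for i in range(5):
        print(i, surrogate_log_likelihood(x, y, W))
        x = guided_step(x, y, W)
```

In this sketch the forward operator never appears: the only link between the estimate and the degraded recording is the distance between their embeddings, which is the property the blind setting relies on.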
Results
The table below reports the Fréchet Audio Distance (FAD) [3], computed with VGGnet and PANNs [4] embeddings, for the degraded recordings, the LTAS baseline, and CoGuide. Lower values are better.
| Method | FAD (VGG) ↓ | FAD (PANN) ↓ |
|---|---|---|
| Degraded (Original) | 2.52 | 0.39 |
| LTAS | 2.88 | 0.27 |
| CoGuide (Ours) | 0.84 | 0.19 |
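For readers unfamiliar with the metric, FAD fits a Gaussian to each of two sets of audio embeddings and measures the Fréchet distance between the two Gaussians. The sketch below shows only that final distance computation with NumPy; the function name `frechet_distance` is hypothetical, the embeddings are random placeholders rather than VGG or PANNs features, and this is not the evaluation code behind the numbers in the table.

```python
import numpy as np

def frechet_distance(emb_a, emb_b):
    """Fréchet distance between Gaussians fitted to two embedding sets.

    emb_a, emb_b: arrays of shape (num_clips, embedding_dim).
    """
    mu_a, mu_b = emb_a.mean(axis=0), emb_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(emb_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(emb_b, rowvar=False))
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # covariance matrices; clip tiny negative values from round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(500, 4))           # placeholder "clean" embeddings
    restored = rng.normal(loc=0.5, size=(500, 4))   # placeholder "restored" embeddings
    print(frechet_distance(reference, restored))
```

A smaller distance means the embedding statistics of the evaluated set are closer to those of the reference set, which is why the table marks both columns with a downward arrow.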
Below, we show qualitative audio samples with spectrograms for comparison across different composers and pieces.
Beethoven
Original (Degraded)
LTAS
CoGuide
Chopin - Fantaisie
Original (Degraded)
LTAS
CoGuide
Chopin - Mazurka
Original (Degraded)
LTAS
CoGuide
Chopin - Sonata
Original (Degraded)
LTAS
CoGuide
Chopin - Waltz
Original (Degraded)
LTAS
CoGuide
Horowitz
Original (Degraded)
LTAS
CoGuide
Horowitz - Etude
Original (Degraded)
LTAS
CoGuide
Liszt
Original (Degraded)
LTAS
CoGuide
Mozart
Original (Degraded)
LTAS
CoGuide
Moszkowski
Original (Degraded)
LTAS
CoGuide
Rachmaninoff
Original (Degraded)
LTAS
CoGuide
Jungmann
Original (Degraded)
LTAS
CoGuide