Commit 1d1bfd9

fix link
1 parent 6677f6b commit 1d1bfd9

1 file changed

+4
-4
lines changed

Models/Vocoder/2024.08.14_PeriodWave.md

Lines changed: 4 additions & 4 deletions
@@ -76,7 +76,7 @@ Although GAN-based models can generate the high-fidelity waveform signal fast, G
 Recently, the multi-band diffusion (MBD) model \citep{roman2023from} sheds light on the effectiveness of the diffusion model for high-resolution waveform modeling.
 Although previous diffusion-based waveform models ([DiffWave](2020.09.21_DiffWave.md); [WaveGrad](2020.09.02_WaveGrad.md)) existed, they could not model the high-frequency information so the generated waveform only contains low-frequency information.
 Additionally, they still require a lot of iterative steps to generate high-fidelity waveform signals.
-To reduce this issue, [PriorGrad](2021.06.11_PriorGrad.md) introduced a data-driven prior and [FastDiff](../Diffusion/2022.04.21_FastDiff.md) adopted an efficient structure and noise schedule predictor.
+To reduce this issue, [PriorGrad](2021.06.11_PriorGrad.md) introduced a data-driven prior and [FastDiff](2022.04.21_FastDiff.md) adopted an efficient structure and noise schedule predictor.
 However, they do not model the high-frequency information so these models only generate the low-frequency information well.

 Above all, there is no generator architecture to reflect the natural periodic features of high-resolution waveform signals.
@@ -106,8 +106,8 @@ To reduce this issue, we adopt the DWT for more accurate frequency-wise vector f

 [WaveNet](2016.09.12_WaveNet.md) has successfully paved the way for high-quality neural waveform generation tasks.
 However, these auto-regressive (AR) models suffer from a slow inference speed.
-To address this limitation, teacher-student distillation-based inverse AR flow methods ([Parallel WaveNet](../Vocoder/2017.11.28_Parallel_WaveNet.md_WaveNet.md); [ClariNet](../E2E/2018.07.19_ClariNet.md)) have been investigated for parallel waveform generation.
-Flow-based models ([FloWaveNet](../Vocoder/2018.11.06_FloWaveNet.mdoWaveNet.md); 2018.10.31_WaveGlow.mdWaveGlow.md); 2020.06.11_NanoFlow.mdNanoFlow.md)) have also been utilized, which can be trained by simply maximizing the likelihood of the data using invertible transformation.
+To address this limitation, teacher-student distillation-based inverse AR flow methods ([Parallel WaveNet](../Vocoder/2017.11.28_Parallel_WaveNet.md); [ClariNet](../E2E/2018.07.19_ClariNet.md)) have been investigated for parallel waveform generation.
+Flow-based models ([FloWaveNet](../Vocoder/2018.11.06_FloWaveNet.md); [WaveGlow](../Vocoder/2018.10.31_WaveGlow.md); [NanoFlow](../Vocoder/2020.06.11_NanoFlow.md)) have also been utilized, which can be trained by simply maximizing the likelihood of the data using invertible transformation.

 ### GAN-based Neural Vocoder

@@ -126,7 +126,7 @@ Meanwhile, neural codec models ([SoundStream](../SpeechCodec/2021.07.07_SoundStr

 [DiffWave](2020.09.21_DiffWave.md) and [WaveGrad](2020.09.02_WaveGrad.md) introduced a Mel-conditional diffusion-based neural vocoder that can estimate the gradients of the data density.
 [PriorGrad](2021.06.11_PriorGrad.md) improves the efficiency of the conditional diffusion model by adopting a data-dependent prior distribution for diffusion models instead of a standard Gaussian distribution.
-[FastDiff](../Diffusion/2022.04.21_FastDiff.md) proposed a fast conditional diffusion model by adopting an efficient generator structure and noise schedule predictor.
+[FastDiff](2022.04.21_FastDiff.md) proposed a fast conditional diffusion model by adopting an efficient generator structure and noise schedule predictor.
 Multi-band Diffusion \citep{roman2023from} incorporated multi-band waveform modeling into diffusion models and it significantly improved the performance by band-wise modeling because previous diffusion methods could not model high-frequency information, which only generated the low-frequency representations.
 This model also focused on raw waveform generation from discrete tokens of neural codec model for various audio generation applications including speech, music, and environmental sound.
