Commit cdc55e2

Update abstract
1 parent 5425cb5 commit cdc55e2

1 file changed, +16 -4 lines changed

Models/SVS/2025.01.23_Everyone-Can-Sing.md

@@ -1,4 +1,4 @@
1-
# Title
1+
# Everyone-Can-Sing
22

33
<details>
44
<summary>Basic Information</summary>
@@ -11,8 +11,8 @@
1111
- 04 Zeyu Jin
1212
- Links:
1313
- [ArXiv](https://arxiv.org/abs/2501.13870)
14-
- [Publication]()
15-
- [Github]()
14+
- [Publication]
15+
- [Github]
1616
- [Demo](http://everyone-can-sing.github.io/)
1717
- Files:
1818
- [ArXiv](_PDF/2501.13870v1__Everyone-Can-Sing__Zero-Shot_Singing_Voice_Synthesis_and_Conversion_with_Speech_Reference.pdf)
@@ -22,15 +22,27 @@
2222

2323
## Abstract
2424

25+
<details>
26+
<summary>Show original text</summary>
27+
2528
We propose a unified framework for Singing Voice Synthesis (SVS) and Conversion (SVC), addressing the limitations of existing approaches in cross-domain SVS/SVC, poor output musicality, and scarcity of singing data.
2629
Our framework enables control over multiple aspects, including language content based on lyrics, performance attributes based on a musical score, singing style and vocal techniques based on a selector, and voice identity based on a speech sample.
2730
The proposed zero-shot learning paradigm consists of one SVS model and two SVC models, utilizing pre-trained content embeddings and a diffusion-based generator.
2831
The proposed framework is also trained on mixed datasets comprising both singing and speech audio, allowing singing voice cloning based on speech reference.
2932
Experiments show substantial improvements in timbre similarity and musicality over state-of-the-art baselines, providing insights into other low-data music tasks such as instrumental style transfer.
3033
Examples can be found at: [this http URL](http://everyone-can-sing.github.io/).
3134

32-
## 1·Introduction
35+
</details>
36+
<br>
37+
38+
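We propose a unified framework for Singing Voice Synthesis (SVS) and Singing Voice Conversion (SVC) that addresses the limitations of existing methods in cross-domain SVS/SVC, poor output musicality, and the scarcity of singing data.
39+
Our framework enables control over multiple aspects, including language content based on lyrics, performance attributes based on a musical score, singing style and vocal techniques based on a selector, and voice identity based on a speech sample.
40+
The proposed zero-shot learning paradigm consists of one SVS model and two SVC models, using pre-trained content embeddings and a diffusion-based generator.
41+
The proposed framework is also trained on mixed datasets of singing and speech audio, which allows singing voice cloning from a speech reference.
42+
Experiments show substantial improvements in timbre similarity and musicality over existing baselines, providing insights for other low-data music tasks such as instrumental style transfer.
43+
Examples can be found at [this link](http://everyone-can-sing.github.io/).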
3344

45+
## 1·Introduction
3446

3547
Singing voice synthesis (SVS), which generates singing voice signals from music scores, is gaining increasing importance in generative AI and benefiting various applications in music production and entertainment.
3648
Recent advances in deep-learning-based audio synthesis, such as acoustic models ([FastSpeech2](../Acoustic/2020.06.08_FastSpeech2.md)[^1]), neural vocoders ([HiFi-GAN](../Vocoder/2020.10.12_HiFi-GAN.md)[^2]; [BigVGAN](../Vocoder/2022.06.09_BigVGAN.md)[^3]), and tokenizer-based codec models ([DAC](../SpeechCodec/2023.06.11_Descript-Audio-Codec.md)[^4]; [SoundStream](../SpeechCodec/2021.07.07_SoundStream.md)[^5]), have greatly improved models' ability to reproduce singing voices from training data [^6] [^7] [^8].
