Data Augmentation for Speech Recognition
Data augmentation in Whisper-Finetune
https://github.com/yeyupiaoling/Whisper-Finetune
https://github.com/yeyupiaoling/Whisper-Finetune/blob/master/configs/augmentation.json
[
  {
    "type": "resample",
    "params": {
      "new_sample_rates": [8000, 32000, 44100]
    },
    "prob": 0.0
  },
  {
    "type": "noise",
    "params": {
      "min_snr_dB": 10,
      "max_snr_dB": 50,
      "noise_dir": "dataset/noise"
    },
    "prob": 0.2
  },
  {
    "type": "speed",
    "params": {
      "min_speed_rate": 0.9,
      "max_speed_rate": 1.1,
      "num_rates": 3
    },
    "prob": 0.5
  },
  {
    "type": "shift",
    "params": {
      "min_shift_ms": -5,
      "max_shift_ms": 5
    },
    "prob": 0.0
  },
  {
    "type": "volume",
    "params": {
      "min_gain_dBFS": -15,
      "max_gain_dBFS": 15
    },
    "prob": 0.5
  }
]
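To make the structure of this config concrete, here is a minimal sketch of how such a file could drive an augmentation pipeline. It is not the Whisper-Finetune implementation: the function names (`apply_volume`, `apply_shift`, `augment`) are hypothetical, and how each parameter is interpreted (a gain in dB converted to an amplitude factor, a shift in milliseconds padded with silence) is an assumption based on the parameter names alone. Only the `shift` and `volume` types are covered, and the waveform is a plain list of floats so the example has no dependencies.

```python
import json
import random

# Two entries copied from the config above; the other types are omitted for brevity.
CONFIG = json.loads("""
[
  {"type": "shift",  "params": {"min_shift_ms": -5, "max_shift_ms": 5}, "prob": 0.0},
  {"type": "volume", "params": {"min_gain_dBFS": -15, "max_gain_dBFS": 15}, "prob": 0.5}
]
""")


def apply_volume(samples, rng, min_gain_dBFS, max_gain_dBFS):
    """Scale the waveform by a random gain drawn uniformly in dB (assumed meaning)."""
    gain_db = rng.uniform(min_gain_dBFS, max_gain_dBFS)
    factor = 10.0 ** (gain_db / 20.0)  # dB -> linear amplitude ratio
    return [s * factor for s in samples]


def apply_shift(samples, rng, sample_rate, min_shift_ms, max_shift_ms):
    """Shift the waveform in time, padding with silence so the length is unchanged."""
    shift_ms = rng.uniform(min_shift_ms, max_shift_ms)
    n = int(shift_ms * sample_rate / 1000)
    n = max(-len(samples), min(len(samples), n))  # never shift past the clip length
    if n > 0:   # delay: silence at the start
        return [0.0] * n + samples[:len(samples) - n]
    if n < 0:   # advance: silence at the end
        return samples[-n:] + [0.0] * (-n)
    return list(samples)


def augment(samples, sample_rate, config, rng):
    """Walk the config in order; each entry fires independently with its own `prob`."""
    for entry in config:
        if rng.random() >= entry["prob"]:
            continue  # this augmentation is skipped for the current sample
        if entry["type"] == "volume":
            samples = apply_volume(samples, rng, **entry["params"])
        elif entry["type"] == "shift":
            samples = apply_shift(samples, rng, sample_rate, **entry["params"])
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    wave = [0.1, -0.2, 0.3, 0.0]
    print(augment(wave, 16000, CONFIG, rng))
```

Under this reading, an entry with `"prob": 0.0` (here `resample` and `shift`) is effectively disabled, while `"prob": 0.5` applies the augmentation to roughly half of the training samples.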
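Other data augmentation methods:
1. Speech synthesis data augmentation.
2. Splitting: when an utterance and its transcript are split at arbitrary points, the audio and the text must be split at corresponding positions, so that each audio piece keeps exactly the text spoken in it.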
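The splitting idea in item 2 can be sketched as follows. The sketch assumes token-level timestamps are already available (for example from forced alignment); that input, the function name `split_pair`, and the choice to cut in the middle of the gap between two tokens are all assumptions for illustration, not something the source specifies.

```python
import random


def split_pair(samples, sample_rate, tokens, rng, sep=""):
    """Cut one (audio, transcript) pair into two aligned pairs.

    `tokens` is a list of (text, start_sec, end_sec) tuples. This alignment
    is an assumed input; the source does not say how it is obtained.
    `sep` joins tokens back into text: "" suits Chinese, " " suits English.
    """
    if len(tokens) < 2:
        # Nothing to split: return the pair unchanged.
        return [(samples, sep.join(t[0] for t in tokens))]

    k = rng.randrange(1, len(tokens))  # index of the first token in the second piece
    # Cut the audio midway between the end of token k-1 and the start of token k.
    cut_sec = (tokens[k - 1][2] + tokens[k][1]) / 2.0
    cut = int(round(cut_sec * sample_rate))

    left_text = sep.join(t[0] for t in tokens[:k])
    right_text = sep.join(t[0] for t in tokens[k:])
    return [(samples[:cut], left_text), (samples[cut:], right_text)]


if __name__ == "__main__":
    audio = list(range(40))  # stand-in for 4 seconds of audio at 10 samples/sec
    words = [("你", 0.0, 1.0), ("好", 1.0, 2.0), ("吗", 2.4, 3.6)]
    for piece_audio, piece_text in split_pair(audio, 10, words, random.Random(0)):
        print(len(piece_audio), piece_text)
```

Cutting at a token boundary keeps every token wholly inside one piece, which is what keeps the two transcripts consistent with their audio.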
