Contents

Data augmentation in Whisper-Finetune

Other data augmentation:


Data augmentation in Whisper-Finetune

https://github.com/yeyupiaoling/Whisper-Finetune

https://github.com/yeyupiaoling/Whisper-Finetune/blob/master/configs/augmentation.json

[
  {
    "type": "resample",
    "params": {
      "new_sample_rates": [8000, 32000, 44100]
    },
    "prob": 0.0
  },
  {
    "type": "noise",
    "params": {
      "min_snr_dB": 10,
      "max_snr_dB": 50,
      "noise_dir": "dataset/noise"
    },
    "prob": 0.2
  },
  {
    "type": "speed",
    "params": {
      "min_speed_rate": 0.9,
      "max_speed_rate": 1.1,
      "num_rates": 3
    },
    "prob": 0.5
  },
  {
    "type": "shift",
    "params": {
      "min_shift_ms": -5,
      "max_shift_ms": 5
    },
    "prob": 0.0
  },
  {
    "type": "volume",
    "params": {
      "min_gain_dBFS": -15,
      "max_gain_dBFS": 15
    },
    "prob": 0.5
  }
]
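
To make the role of the "prob" and "params" fields concrete, here is a minimal sketch of how such a configuration could be interpreted. It is not the Whisper-Finetune implementation: the function names (`rms`, `change_volume`, `add_noise`, `augment`) are hypothetical, samples are plain Python floats, and the noise clip is passed in directly rather than loaded from "noise_dir". It assumes each entry is applied independently with probability "prob", that the volume gain is a dB value converted to an amplitude factor of 10^(dB/20), and that noise is scaled to reach a randomly drawn signal-to-noise ratio.

```python
import json
import math
import random

# Two entries copied from the configuration above; the rest are omitted for brevity.
CONFIG_TEXT = """
[
  {"type": "noise",
   "params": {"min_snr_dB": 10, "max_snr_dB": 50, "noise_dir": "dataset/noise"},
   "prob": 0.2},
  {"type": "volume",
   "params": {"min_gain_dBFS": -15, "max_gain_dBFS": 15},
   "prob": 0.5}
]
"""


def rms(samples):
    """Root-mean-square amplitude of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def change_volume(samples, gain_db):
    """Scale the waveform by a gain in dB (amplitude factor 10^(dB/20))."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


def add_noise(samples, noise, snr_db):
    """Mix a noise clip into the signal at the requested SNR in dB."""
    # Repeat the noise clip so that it covers the whole signal.
    tiled = [noise[i % len(noise)] for i in range(len(samples))]
    # Choose the noise level so that 20*log10(rms(signal)/rms(noise)) == snr_db.
    target_noise_rms = rms(samples) / (10.0 ** (snr_db / 20.0))
    factor = target_noise_rms / max(rms(tiled), 1e-12)
    return [s + factor * n for s, n in zip(samples, tiled)]


def augment(samples, config, noise, rng):
    """Apply each configured augmentation with its own probability (assumed semantics)."""
    for entry in config:
        # prob 0.0 never fires, prob 1.0 always fires.
        if rng.random() >= entry["prob"]:
            continue
        params = entry["params"]
        if entry["type"] == "volume":
            gain = rng.uniform(params["min_gain_dBFS"], params["max_gain_dBFS"])
            samples = change_volume(samples, gain)
        elif entry["type"] == "noise":
            snr = rng.uniform(params["min_snr_dB"], params["max_snr_dB"])
            samples = add_noise(samples, noise, snr)
        # Other types (resample, speed, shift) are left out of this sketch.
    return samples


config = json.loads(CONFIG_TEXT)
```

Under these assumptions, entries with "prob": 0.0 (resample and shift in the file above) are effectively disabled, while speed and volume changes each fire on roughly half of the training samples.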

Other data augmentation:

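1. Data augmentation with synthesized speech:

2. Splitting: given an audio clip and its transcript, if the clip is split at arbitrary points, the transcript must be split at the corresponding points so that every audio segment stays aligned with its text.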
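
The splitting idea in point 2 can be sketched as follows. This is only an illustration under a strong assumption: word-level timestamps are already available (for example from a forced aligner), which the original note does not specify. The function name `split_at_word`, the `(text, start_sec, end_sec)` tuple layout, and the `sep` parameter are all hypothetical.

```python
def split_at_word(samples, sample_rate, words, cut_index, sep=" "):
    """Split audio and transcript at the same point.

    words is a time-ordered list of (text, start_sec, end_sec) tuples.
    The cut falls between words[cut_index - 1] and words[cut_index].
    Returns two (audio_samples, text) pairs.
    """
    if not 0 < cut_index < len(words):
        raise ValueError("cut_index must leave at least one word on each side")
    # Cut in the middle of the pause between the two neighbouring words.
    cut_time = (words[cut_index - 1][2] + words[cut_index][1]) / 2.0
    cut_sample = int(round(cut_time * sample_rate))
    left_text = sep.join(w[0] for w in words[:cut_index])
    right_text = sep.join(w[0] for w in words[cut_index:])
    return (samples[:cut_sample], left_text), (samples[cut_sample:], right_text)
```

For Chinese transcripts, passing `sep=""` joins the pieces without spaces. Cutting in the pause between two words, rather than at an arbitrary sample, is a design choice of this sketch to avoid cutting through a word.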
