AISHELL reproduction results do not match the README results #4

brightLLer · 2024-08-21T12:21:41Z

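Hi everyone, we reproduced the whisper large-v3 + Qwen2 7B experiment on AISHELL-1, but the model output shows obvious "repetition" (the last few characters are repeated many times), and it also emits punctuation marks and special symbols. Raising the LLM's repetition_penalty at inference time reduced the repetition, but even after removing all punctuation the character error rate is still above 11%, far from the 5.55% reported in README.md. Below is our training command (the whisper feature extraction in the code is 80-dimensional, so we added an n_mels parameter, set to 128, to support large-v3):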

torchrun --standalone --nnodes=1 --nproc_per_node=8 train.py \
        --llm_model_name_or_path Qwen2-7B-Instruct \
        --whisper_model_name_or_path whisper/large-v3.pt \
        --data_path aishell/train/train.jsonl \
        --eval_data_path aishell/dev/eval.jsonl \
        --bf16 True \
        --output_dir Qwen-7B-Instruct-whisper-large-v3-aishell \
        --num_train_epochs 10 \
        --per_device_train_batch_size 16 \
        --per_device_eval_batch_size 8 \
        --gradient_accumulation_steps 8 \
        --evaluation_strategy "no" \
        --save_strategy "steps" \
        --save_steps 100 \
        --save_total_limit 10 \
        --learning_rate 3e-4 \
        --weight_decay 0.01 \
        --adam_beta2 0.95 \
        --warmup_ratio 0.01 \
        --lr_scheduler_type "cosine" \
        --logging_steps 1 \
        --report_to "none" \
        --model_max_length 512 \
        --n_mels 128 \
        --gradient_checkpointing \
        --dataloader_num_workers 4 \
        --dataloader_prefetch_factor 10 \
        --deepspeed ds_config_zero3.json
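
A rough sanity check of the schedule implied by the command above, as a minimal sketch. It assumes the AISHELL-1 training set has about 120,098 utterances (a figure not stated in this thread) and only approximates how the Hugging Face Trainer counts optimizer steps; the function name `training_schedule` is made up for illustration.

```python
import math


def training_schedule(num_samples, gpus, per_device_batch, grad_accum,
                      epochs, warmup_ratio):
    """Approximate the optimizer-step schedule for a data-parallel run."""
    # Utterances consumed per optimizer step across all GPUs.
    global_batch = gpus * per_device_batch * grad_accum
    # Micro-batches each GPU sees per epoch.
    batches_per_epoch = math.ceil(num_samples / (gpus * per_device_batch))
    # One optimizer step every `grad_accum` micro-batches.
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "global_batch": global_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Values taken from the command above; num_samples is an assumption.
schedule = training_schedule(
    num_samples=120_098,
    gpus=8,
    per_device_batch=16,
    grad_accum=8,
    epochs=10,
    warmup_ratio=0.01,
)
print(schedule)
```

Under these assumptions the global batch is 1024 utterances, which gives roughly 117 optimizer steps per epoch and about 1170 in total, so `--save_steps 100` would produce around 11 checkpoints, of which `--save_total_limit 10` keeps the most recent 10.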


robin1001 · 2024-08-22T01:30:29Z

You could first try running with the default configuration given in the README, and evaluate several intermediate checkpoints.

brightLLer · 2024-08-22T06:34:14Z

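> You could first try running with the default configuration given in the README, and evaluate several intermediate checkpoints.

The configuration in the README is for the 1.5B model. Is the same configuration used for 7B? We reran the experiment with the README configuration, but the results are worse than with the configuration in my question above, and the model output is now nonsense... o(╥﹏╥)o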

robin1001 · 2024-08-22T06:36:23Z

You could use the largest of both, the 7B LLM and whisper large, to see what the upper bound is.

brightLLer · 2024-08-22T07:14:54Z

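> You could use the largest of both, the 7B LLM and whisper large, to see what the upper bound is.

Our experiment already used these two largest models. We trained for 10 epochs and the loss dropped very low, basically matching the curve in the README, but the WER only gets down to about 11%. Most of the errors are insertions and substitutions, which looks like it is caused by the LLM's own hallucination...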
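
The breakdown into insertion, substitution, and deletion errors mentioned here can be reproduced with a plain edit-distance alignment. The following is a minimal sketch, not the scoring script used in this thread: the helper names `normalize`, `error_counts`, and `cer` are hypothetical, punctuation is stripped by Unicode category, and the example sentence is invented rather than taken from the experiment.

```python
import unicodedata


def normalize(text):
    """Drop whitespace, punctuation, and symbols; keep letters, digits, CJK."""
    return "".join(
        ch for ch in text
        if not (ch.isspace() or unicodedata.category(ch)[0] in "PS")
    )


def error_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) of one minimal alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # dp[i][j] = (cost, subs, dels, ins) for ref[:i] versus hyp[:j]
    dp = [[None] * cols for _ in range(rows)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        dp[i][0] = (i, 0, i, 0)  # everything in ref deleted
    for j in range(1, cols):
        dp[0][j] = (j, 0, 0, j)  # everything in hyp inserted
    for i in range(1, rows):
        for j in range(1, cols):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no cost
                continue
            c, s, d, n = dp[i - 1][j - 1]
            sub = (c + 1, s + 1, d, n)
            c, s, d, n = dp[i - 1][j]
            dele = (c + 1, s, d + 1, n)
            c, s, d, n = dp[i][j - 1]
            ins = (c + 1, s, d, n + 1)
            dp[i][j] = min(sub, dele, ins, key=lambda t: t[0])
    return dp[-1][-1][1:]


def cer(ref, hyp):
    """Character error rate after normalization."""
    ref, hyp = normalize(ref), normalize(hyp)
    subs, dels, ins = error_counts(ref, hyp)
    return (subs + dels + ins) / len(ref)


# Invented example of a "repeating tail": the hypothesis loops on the last word.
reference = "今天天气很好。"
hypothesis = "今天天气很好，很好很好"
print(error_counts(normalize(reference), normalize(hypothesis)))
print(cer(reference, hypothesis))
```

In this invented example the repeated tail is counted purely as insertions, which is the pattern one would expect if the errors come from the LLM continuing to generate after the transcript is complete.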

KIP1024 · 2024-08-23T07:54:18Z

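> > You could use the largest of both, the 7B LLM and whisper large, to see what the upper bound is.
>
> Our experiment already used these two largest models. We trained for 10 epochs and the loss dropped very low, basically matching the curve in the README, but the WER only gets down to about 11%. Most of the errors are insertions and substitutions, which looks like it is caused by the LLM's own hallucination...

Hello, could I ask what environment you used for training? I am not sure what kind of GPUs and training procedure are needed. My QQ number is 1147893880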

robin1001 · 2024-08-26T01:54:59Z

8 × 3090 GPUs

KIP1024 · 2024-08-26T06:22:16Z

> 8 × 3090 GPUs

OK, got it, thanks a lot!

KIP1024 · 2024-09-04T07:25:16Z

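> 8 × 3090 GPUs

Brother Bin, our lab has a single Tesla A100-40G card. Is that enough to work with Qwen2 7B? We plan to fine-tune it as a language model for our own scenario and then combine it with an acoustic model for ASR.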
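
As a back-of-the-envelope aid for this question, here is a minimal sketch of the memory needed for model state alone. It assumes roughly 7.6 billion parameters for Qwen2 7B and the common rule of thumb of about 16 bytes per parameter for mixed-precision Adam training (bf16 weights and gradients plus fp32 master weights and two optimizer moments); neither number comes from this thread, and activation memory is ignored.

```python
GIB = 2 ** 30

# Assumption: approximate parameter count for a "7B" model such as Qwen2 7B.
NUM_PARAMS = 7.6e9


def model_state_gib(num_params, bytes_per_param):
    """Memory for model state only (no activations, no framework overhead)."""
    return num_params * bytes_per_param / GIB


# bf16 weights only, e.g. for inference.
weights_only = model_state_gib(NUM_PARAMS, 2)

# Rule-of-thumb full fine-tuning with Adam in mixed precision:
# 2 (bf16 weights) + 2 (bf16 grads) + 4 (fp32 master weights)
# + 4 + 4 (fp32 Adam first and second moments) = 16 bytes per parameter.
full_finetune = model_state_gib(NUM_PARAMS, 16)

print(f"bf16 weights only:      {weights_only:.1f} GiB")
print(f"full fine-tune (Adam):  {full_finetune:.1f} GiB")
```

Under these assumptions the bf16 weights alone would fit on a 40 GB card, while full fine-tuning with Adam would not fit without offloading or a parameter-efficient method such as LoRA.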
