Help! Model merge fails after LoRA fine-tuning Qwen2VL #2495

gxlover0625 · 2024-11-25T03:15:45Z

Describe the bug

Step 1: LoRA fine-tuning (works)

I fine-tuned Qwen2VL with LoRA. Training ran normally and no errors occurred. The fine-tuning command was:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 NPROC_PER_NODE=8 swift sft \
  --model_type qwen2-vl-7b-instruct \
  --model_id_or_path /home/llm/qwen/Qwen2-VL-7B-Instruct \
  --max_length 1024 \
  --sft_type lora \
  --lora_rank 8 \
  --lora_alpha 16 \
  --lora_dropout 0.0 \
  --dataset /home/Fixed-Train-Dataset/alignment-v3.jsonl \
  --learning_rate 0.0001 \
  --save_only_model true \
  --dataset_test_ratio 0.05 \
  --batch_size 8 \
  --eval_batch_size 8 \
  --num_train_epochs 3 \
  --gradient_accumulation_steps 2 \
  --lr_scheduler_type cosine \
  --warmup_ratio 0.1 \
  --eval_steps 50 \
  --save_steps 50 \
  --logging_steps 10 \
  --preprocess_num_proc 4 \
  --logging_dir /home/sft_middle_results/1125/alignment-v3 \
  --output_dir /home/sft_middle_results/1125/alignment-v3 \
  --save_strategy steps \
  --evaluation_strategy steps \
  --add_output_dir_suffix false
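
As a sanity check on the command above, the arithmetic below works out the effective batch size and the number of optimizer steps per epoch. It is only a sketch: it assumes `--batch_size` is applied per device under multi-GPU training, and that `checkpoint-891` (the checkpoint used in the merge step) is the final training step. Neither assumption is confirmed in this report.

```python
# Values copied from the training command; the interpretation is an assumption.
num_gpus = 8          # NPROC_PER_NODE
per_device_batch = 8  # --batch_size (assumed to be per device)
grad_accum = 2        # --gradient_accumulation_steps
num_epochs = 3        # --num_train_epochs
final_step = 891      # from checkpoint-891 (assumed to be the last step)


def effective_batch_size(gpus: int, per_device: int, accum: int) -> int:
    """Samples consumed per optimizer step across all devices."""
    return gpus * per_device * accum


def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Average number of optimizer steps in one epoch."""
    return total_steps / epochs


eff = effective_batch_size(num_gpus, per_device_batch, grad_accum)
spe = steps_per_epoch(final_step, num_epochs)

print(f"effective batch size: {eff}")
print(f"optimizer steps per epoch: {spe}")
# Rough upper bound on training samples per epoch (the last step may be partial).
print(f"approx. training samples per epoch: {int(eff * spe)}")
```

Under these assumptions, 891 is not a multiple of `--save_steps 50`, which is consistent with it being the end-of-training checkpoint rather than a periodic one.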

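The fine-tuning log is as follows:

Step 2: LoRA merge (works)

I merged the LoRA weights with the following command. The merge also completed normally.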

swift merge-lora \
  --ckpt_dir /home/sft_middle_results/1125/alignment-v3/checkpoint-891

Step 3: Inference with the merged model (bug)

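from transformers import AutoProcessor

# model_path is the merged model directory; min_pixels and max_pixels
# are defined elsewhere in the script (values not shown in this report).
processor = AutoProcessor.from_pretrained(
    model_path,
    min_pixels=min_pixels,
    max_pixels=max_pixels,
)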

The code fails with: Exception: data did not match any variant of untagged enum ModelWrapper at line 757371 column 3

Can the team fix this issue? Any help is appreciated.
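
The `untagged enum ModelWrapper` message comes from the `tokenizers` library failing to parse `tokenizer.json`. One commonly reported cause is that the file was written by a newer `tokenizers` release than the one installed in the environment that loads it. That cause is not confirmed in this thread. A minimal sketch for recording the relevant package versions in both the merge environment and the inference environment (the package list is an assumption about what is relevant here):

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names are the PyPI distribution names.
    wanted = ["transformers", "tokenizers", "peft", "ms-swift"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

If the two environments report different `tokenizers` versions, aligning them (or copying the tokenizer files from the original model folder into the merged folder) is a reasonable first thing to try.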

Your hardware and system info
Write your system info like CUDA version/system/GPU/torch version here
Linux, 8x H20 GPUs, torch 2.5.1+cuda124, CUDA 11.8

Additional context
Add any other context about the problem here
Original qwen2vl folder

Folder after merging LoRA

Training loss decreased normally during training
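
Since the report contrasts the original model folder with the merged one, a small script can make that comparison explicit. The sketch below lists files that exist in only one folder and small files whose contents differ, such as `tokenizer.json`. The function names and the 100 MB size limit are hypothetical choices for illustration, not part of swift.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_dirs(original_dir, merged_dir, max_bytes=100 * 1024 * 1024):
    """Compare top-level files of two model folders.

    Files larger than max_bytes (e.g. weight shards) are not hashed,
    because merged weights are expected to differ anyway.
    """
    original = {p.name: p for p in Path(original_dir).iterdir() if p.is_file()}
    merged = {p.name: p for p in Path(merged_dir).iterdir() if p.is_file()}
    report = {
        "only_in_original": sorted(set(original) - set(merged)),
        "only_in_merged": sorted(set(merged) - set(original)),
        "different": [],
    }
    for name in sorted(set(original) & set(merged)):
        a, b = original[name], merged[name]
        if a.stat().st_size > max_bytes or b.stat().st_size > max_bytes:
            continue
        if file_digest(a) != file_digest(b):
            report["different"].append(name)
    return report
```

Calling `compare_dirs("/home/llm/qwen/Qwen2-VL-7B-Instruct", merged_dir)`, where `merged_dir` is the folder produced by the merge step, would show whether the tokenizer and processor files were rewritten during the merge.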

Jintao-Huang · 2024-11-25T05:40:59Z

same issue here: #2494
