[EMNLP 2024] Breaking ReLU Barrier: Generalized Mixture of Experts for Dense Pretrained Models

This code is based on MoEfication.

We provide example script for MoEfying BLOOM. Code for MoEfying mBERT is also provided.

Expert Construction

export model_path=BLOOM_PATH
export moe_workspace=WORKSPACE_PATH
export ffn_template='h.{}.mlp.dense_h_to_4h.weight'
export ffn_out_template='h.{}.mlp.dense_4h_to_h.weight'
export num_expert=64
export num_layers=24 # for bloom-1.7b
export last_layer=23 # for bloom-1.7b
python moefication/param_cluster_example.py --model_path=${model_path}/pytorch_model.bin --res_path=${moe_workspace}/ --templates=${ffn_template} --num-expert=${num_expert} --num-layer=${num_layers}

export corpus_file=CORPUS_PATH
export sample_size=10000
python examples/clm-inf-pattern-bloom.py --train_file=$corpus_file --model_name_or_path=${model_path} --res_path=${moe_workspace}/0 --templates=${ffn_template} --split_size=${sample_size} --start_split=0 --per_device_train_batch_size=1 # can be splitted into parallel runs by controlling split parameters
python examples/concat_multiple.py --res_path=${moe_workspace}/ --templates=${ffn_template} --num_layers=${num_layers}


python examples/calculate_mean.py --res_path=${moe_workspace}/ --templates=${ffn_out_template} --start_layer={layer_id} --end_layer={layer_id}
python moefication/mlp_select_example.py --config_path=${model_path} --mean_template=${ffn_out_template}_mean --templates=${ffn_template} --model_path=${model_path}/pytorch_model.bin --res_path=${moe_workspace}/ --num-expert=${num_expert} --start_layer=0 --end_layer=${last_layer}

Next, you may evaluate the model using the lm-evaluation-harness provided. The evaluation code is a modified version of lm-evaluation-harness.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.idea		.idea
examples		examples
lm-evaluation-harness		lm-evaluation-harness
moefication		moefication
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

[EMNLP 2024] Breaking ReLU Barrier: Generalized Mixture of Experts for Dense Pretrained Models

Expert Construction

About

Releases

Packages

Languages

thnkinbtfly/G-MoEfication

Folders and files

Latest commit

History

Repository files navigation

[EMNLP 2024] Breaking ReLU Barrier: Generalized Mixture of Experts for Dense Pretrained Models

Expert Construction

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages