gpt oss expert layers

September 30, 2025


learned from this guide by openai that if you're doing LoRA finetuning, you should target the projection layers within the expert modules as well:

```python
from peft import LoraConfig, get_peft_model

peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    # adapt every nn.Linear module in the model
    target_modules="all-linear",
    # the MoE expert weights are stored as plain nn.Parameter tensors, so
    # "all-linear" misses them; target_parameters picks them up by name
    target_parameters=[
        "7.mlp.experts.gate_up_proj",
        "7.mlp.experts.down_proj",
        "15.mlp.experts.gate_up_proj",
        "15.mlp.experts.down_proj",
        "23.mlp.experts.gate_up_proj",
        "23.mlp.experts.down_proj",
    ],
)
peft_model = get_peft_model(model, peft_config)
peft_model.print_trainable_parameters()
```
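
for context, the snippet above assumes a `model` that's already loaded. here's a minimal sketch of that setup, assuming the `openai/gpt-oss-20b` checkpoint on the hugging face hub and recent transformers/peft versions (the `target_parameters` field needs a fairly new peft release); the checkpoint id and dtype choice are my assumptions, not from the guide:

```python
# sketch of loading the base model before applying the LoRA adapter
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"  # assumed: the smaller gpt-oss checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed: bf16 to keep memory manageable
    device_map="auto",           # spread layers across available devices
)
```

the indices 7, 15, and 23 in the config point at specific decoder blocks, so the adapter only touches the experts in a few spaced-out layers rather than all of them, which keeps the trainable parameter count down.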

i still have issues trying to finetune this model, and it's not clear to me that i'm doing things right. but i am having fun. i just wish this were a month-long research project rather than a one-week speedrun to get my project done.