LoRA (Low-Rank Adaptation) enables parameter-efficient fine-tuning of large language models like LLaMA-2 by injecting small trainable low-rank matrices into selected weight matrices (here, the attention projections) while keeping the original weights frozen, which drastically reduces the number of trainable parameters.
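Conceptually, a frozen weight matrix W is augmented with a scaled low-rank update (alpha / r) · B · A, where A and B are the only trainable parameters. The toy PyTorch sketch below is purely illustrative (it is not part of the tutorial code; PEFT handles all of this internally) and only shows the shapes involved:

```python
import torch

# Toy illustration of the LoRA update (shapes only; PEFT does all of this internally).
d, r, alpha = 4096, 8, 32            # hidden size, LoRA rank, scaling factor
W = torch.randn(d, d)                # frozen pretrained weight (e.g. an attention projection)
A = torch.randn(r, d) * 0.01         # trainable low-rank factor
B = torch.zeros(d, r)                # trainable, zero-initialised so the update starts at zero

x = torch.randn(1, d)                                 # one input activation
y = x @ W.T + (alpha / r) * (x @ A.T @ B.T)           # frozen path + scaled low-rank update
print(y.shape)                                        # torch.Size([1, 4096])
```

With d = 4096 and r = 8, W holds about 16.8 million values while A and B together contribute only 65,536 trainable ones.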
```python
# bitsandbytes and accelerate are required for 4-bit loading and device_map="auto"
!pip install peft transformers bitsandbytes accelerate
```
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Load the base model in 4-bit so the 7B model fits on a single consumer GPU.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    quantization_config=quant_config,
    device_map="auto",
    token=token          # your Hugging Face access token (assumed to be defined earlier)
)

tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    use_fast=False,
    token=token
)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token; reuse EOS for padding
```
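The Trainer further down expects a `tokenized_dataset` with `"train"` and `"test"` splits, which this section assumes was prepared earlier. As a rough sketch of what that preparation might look like (the `about_me.jsonl` file, the `"text"` column, and the `max_length` are placeholders, not from the original tutorial):

```python
from datasets import load_dataset

# Hypothetical dataset preparation; file name and column name are placeholders.
raw_dataset = load_dataset("json", data_files="about_me.jsonl", split="train")
raw_dataset = raw_dataset.train_test_split(test_size=0.1)

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

tokenized_dataset = raw_dataset.map(tokenize, remove_columns=raw_dataset["train"].column_names)
```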
```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # LLaMA uses q_proj, v_proj
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```
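After `get_peft_model`, only the injected adapter weights are trainable. If you want to confirm where the adapters ended up, a small optional check (not part of the original tutorial) is to list the LoRA submodules PEFT created:

```python
# Optional sanity check: list a couple of the LoRA submodules PEFT injected.
lora_layers = [name for name, _ in model.named_modules() if "lora_" in name]
print(len(lora_layers), lora_layers[:2])
```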
Explanation:
- `r`: LoRA rank (controls the size of the injected matrices; see the parameter-count sketch after this list)
- `lora_alpha`: scaling factor for the LoRA updates
- `target_modules`: which modules to adapt (for LLaMA, use `q_proj` and `v_proj`)
- `lora_dropout`: dropout applied to the LoRA layers
- `bias`: whether to adapt bias terms
- `task_type`: the type of task (here, causal language modeling)
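To see why this is parameter-efficient (and what `print_trainable_parameters()` should roughly report), note that each adapted projection in LLaMA-2-7B is a 4096×4096 matrix across 32 decoder layers, and LoRA adds two rank-8 factors to each one. A quick back-of-the-envelope calculation:

```python
hidden_size, n_layers, r = 4096, 32, 8        # LLaMA-2-7B dimensions and the rank chosen above
per_module = 2 * hidden_size * r              # A (r x 4096) plus B (4096 x r)
trainable = per_module * 2 * n_layers         # q_proj and v_proj in each of the 32 layers
print(f"{trainable:,}")                       # 4,194,304 -- about 0.06% of the ~7B base weights
```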
```python
from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling

training_args = TrainingArguments(
    output_dir="./lora-llama2-about_me",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,   # effective batch size 2 x 4 = 8
    learning_rate=2e-4,
    num_train_epochs=20,
    logging_steps=10,
    save_strategy="epoch",
    eval_strategy="epoch",
    fp16=True,
    push_to_hub=False
)

# mlm=False -> causal language modeling: labels are a copy of input_ids.
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    data_collator=data_collator
)
```
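Because `mlm=False`, the collator prepares batches for causal language modeling: it pads the examples and copies `input_ids` into `labels` (padding positions are set to -100 so the loss ignores them). A quick, optional check of that behaviour (the sample sentence is arbitrary):

```python
# Optional: inspect what the collator produces for one tokenized example.
sample = tokenizer("LoRA keeps the base model weights frozen.")
batch = data_collator([sample])
print(batch["input_ids"].shape)                        # (1, sequence_length)
print((batch["labels"] == batch["input_ids"]).all())   # tensor(True) -- labels mirror input_ids here
```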
Below is a visual representation of how LoRA adapts the model’s architecture by injecting low-rank adapters into the attention layers: