The Stable Diffusion LoRA file you downloaded last night? Probably under 10 MB. That’s not a compressed archive – that’s the entire trained adapter. The official PEFT README puts the number at 8.8 MB for a full Stable Diffusion LoRA checkpoint. The same library shrinks a bigscience/T0_3B fine-tune from 11 GB down to 19 MB, and its flagship bigscience/mt0-large example trains only 0.19% of the model’s parameters. This guide covers deploying Hugging Face PEFT v0.19.1 on a fresh machine – the version-specific details, the configuration that actually runs, and the install-time traps that nobody warns you about.
Why version matters right now
PEFT 0.19.1 landed on April 16, 2026. Two days earlier: 0.19.0. Before that: 0.18.1 (Jan 9, 2026) and 0.18.0 (Nov 13, 2025). The cadence has accelerated – and so has the breakage surface. The hard constraint: Transformers v5 is incompatible with PEFT below 0.18.0 (per the PEFT release notes). Older blog posts and StackOverflow threads from 2024 will quietly break on a fresh install that pulls Transformers v5. Pin 0.19.1 and skip the debugging session.
One thing changed in 0.19: add_weighted_adapter now accepts negative weights. Previously, only positive values worked. If you’re merging adapters, that unlocks subtraction-style composition for the first time.
System requirements
| Component | Required | Notes |
|---|---|---|
| Python | 3.10+ | 3.11+ recommended; 3.9 was dropped in 0.18 |
| OS | Linux / macOS / Windows (WSL2) | Linux preferred for CUDA support |
| Core deps | PyTorch, transformers, accelerate | peft >= 0.18.0 required for Transformers v5 integration |
| Optional | bitsandbytes | Required for QLoRA / 4-bit quantization |
Docs caveat: the official install page still lists Python 3.9+ as supported, but 0.18 dropped it when 3.9 reached end of life. Install PEFT on Python 3.9 and pip silently resolves to an old 0.17.x release. That’s a real foot-gun – check your Python version before anything else.
Install Hugging Face PEFT (the recommended path)
Source of truth: github.com/huggingface/peft and PyPI. Create a clean virtualenv first – mixing PEFT versions across projects is how the target_modules error gets weird:
# 1. fresh env
python3.11 -m venv .venv
source .venv/bin/activate
# 2. core stack
pip install --upgrade pip
pip install "peft==0.19.1" transformers accelerate datasets
# 3. optional: 4-bit quantization for QLoRA
pip install bitsandbytes
Two alternatives if you need bleeding edge or want to contribute:
# latest unreleased main (may be buggy)
pip install git+https://github.com/huggingface/peft
# editable / contributor install
git clone https://github.com/huggingface/peft
cd peft
pip install -e ".[test]"  # quotes keep zsh from globbing [test]
The git+ install grabs features not yet in a release – useful for testing a fix, risky for production. The [test] editable install is what the official PEFT install guide recommends for contributors.
First-time configuration that actually runs
Skip the recycled Llama-7B QLoRA demo. Here’s the smallest config that proves your install works – a LoRA adapter on a 0.5B Qwen model, runnable on a laptop GPU:
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")
lora_config = LoraConfig(
r=8,
lora_alpha=16,
lora_dropout=0.05,
bias="none",
task_type="CAUSAL_LM",
# target_modules omitted on purpose - see below
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# trainable params: ~1M || all params: 494M || trainable%: 0.20%
Turns out PEFT already knows the right layer names for common architectures. For Llama, Gemma, Qwen, and similar, predefined default targets (like q_proj and v_proj) mean you can skip target_modules entirely – per the Transformers PEFT docs. Pass it explicitly only when targeting non-default layers or an architecture PEFT doesn’t recognize. Most tutorials predate this feature and still hard-code the module names.
Verify it works
Three quick checks. Each catches a different failure mode.
- Version sanity: `python -c "import peft; print(peft.__version__)"` – must print `0.19.1`.
- Transformers integration: `python -c "from transformers import AutoModel; m = AutoModel.from_pretrained('hf-internal-testing/tiny-random-OPTForCausalLM'); print(hasattr(m, 'add_adapter'))"` – must print `True`. When peft >= 0.18.0 is installed, `PeftAdapterMixin` gets added to all `PreTrainedModel` classes automatically.
- Adapter save round-trip: after running the LoRA snippet above, call `model.save_pretrained("./test-adapter")`. The directory should contain only `adapter_model.safetensors` and `adapter_config.json`. Full base weights appearing there means something is wrong with your version.
Pro tip: run `model.print_trainable_parameters()` immediately after `get_peft_model`. If it prints 100% trainable, your LoRA didn’t attach – usually because `target_modules` was wrong for the architecture. Fix this before you waste GPU hours.
Common install errors and the actual fixes
The most common error has nothing to do with installation per se. It shows up the first time you try to attach a LoRA to a non-default architecture.
Error 1: `ValueError: Target modules ['q_proj', 'v_proj'] not found in the base model`. Module names in your LoraConfig don’t exist in the architecture – BioGPT doesn’t expose layers the way Llama does. Fix: run `print(model)`, copy the actual linear layer names, then either name them explicitly or pass `target_modules="all-linear"`.
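The `print(model)` dump gets long on real models; a small helper (`linear_module_names` is made up here, not a PEFT utility) filters it down to exactly the names `target_modules` will accept:

```python
import torch.nn as nn

def linear_module_names(model: nn.Module) -> list[str]:
    """Return the module names you can legally put in target_modules."""
    return [name for name, mod in model.named_modules() if isinstance(mod, nn.Linear)]

# Demonstrated on a stand-in; run it on your real base model instead
toy = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
print(linear_module_names(toy))  # ['0', '2']
```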
Error 2: Same error, but only after torch.compile. Compiled modules gain an _orig_mod. prefix in the state dict – documented in PEFT issue #2957 (Dec 2025) – so PEFT’s target matcher fails with ‘Target modules {transformer_blocks.0.attn.to_q} not found’. Fix: attach the LoRA before calling torch.compile, not after.
Error 3: `ValueError: Please specify target_modules in peft_config` when using TRL or a third-party trainer with a model PEFT doesn’t have defaults for. Don’t rely on auto-detection for custom or less common architectures. Set `target_modules` explicitly.
Error 4 (the new one, as of Transformers v5): Training a MoE model like Qwen3-30B-A3B with DPO. Expert modules in fused MoE layers are nn.Parameter tensors rather than individual nn.Linear layers, so LoRA needs the target_parameters argument – but PEFT currently allows only one adapter per model when target_parameters is used (PEFT issue #2710). That single-adapter limit crashes DPOTrainer’s reference adapter creation (TRL issue #5222). Workaround: downgrade to Transformers 4.x for this specific setup. No clean fix exists as of this writing.
Something worth sitting with: four distinct error types, and three of them have nothing to do with PEFT itself – they’re caused by interaction effects with the compiler, the trainer, or the model architecture. That’s the actual install challenge in 2026. PEFT installs fine. The ecosystem around it is where things break.
Upgrade and uninstall
Upgrading from 0.17.x? Python version first (3.10+ required), then:
pip install --upgrade "peft==0.19.1" transformers
# or for Transformers v5
pip install --upgrade "peft>=0.18.0" "transformers>=5.0"
To uninstall cleanly:
pip uninstall peft
# adapters on disk are just folders with safetensors + json
rm -rf ./your-adapter-dir
Adapter directories are portable. Delete the package and your saved adapter_model.safetensors still loads anywhere a compatible PEFT version exists. That’s the point: the adapter is the artifact, not the environment.
FAQ
Do I need a GPU just to install PEFT?
No. The pip install is CPU-only Python. GPU only enters the picture when you train or run inference on a base model.
Can I run PEFT on Apple Silicon?
The library installs and imports fine on M-series Macs. Training works with PyTorch’s MPS backend for smaller models. The catch: bitsandbytes – required for QLoRA / 4-bit quantization – has limited MPS support, so you’re doing fp16 LoRA, not 4-bit. For anything beyond a small base model, a Linux GPU box is the practical path.
Why does pip install peft pull a different version than I asked for?
Two likely causes. First: you’re on Python 3.9, and pip silently resolves to the last 0.17.x release that supported it – the silent downgrade trap described above. Second: another package (often an older version of transformers or trl) caps peft with an upper version bound. Run pip install "peft==0.19.1" with an exact pin and read pip’s resolver output – it names the exact package blocking the upgrade.
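You can also find the blocking package without re-running the resolver by scanning installed metadata – `requirers` is a helper sketched here for illustration, not a pip command:

```python
from importlib.metadata import distributions

def requirers(package: str) -> list[str]:
    """Installed distributions whose metadata declares a dependency on `package`."""
    package = package.lower()
    hits = set()
    for dist in distributions():
        for req in dist.requires or []:
            # requirement strings look like "peft>=0.18.0; extra == 'train'"
            name = req.split(";")[0].strip().split(" ")[0]
            if name.lower().startswith(package):
                hits.add(dist.metadata["Name"])
    return sorted(hits)

print(requirers("peft"))  # packages in this env that constrain peft, if any
```

Any package this prints is worth checking for an upper bound on peft in its requirements.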
Next step: PEFT verified, attach a LoRA to a real base model and train for one step on a tiny dataset before scaling up. The first end-to-end run is where the remaining environment issues surface – far cheaper to catch them on 100 examples than 100,000.