Krea 2: Run the New 12B Open-Weights Image Model

Krea 2 just dropped as open weights – a 12B aesthetic-first image model. Here's how to actually run Raw and Turbo locally without wasting a day.

Taylor Kim · 2026-06-24 · 7 min read · Beginner

Two ways to try Krea 2 right now: spin up the BF16 Raw weights through the diffusers library and watch your 24GB card sweat, or grab the community FP8 Turbo build and generate a 2K image in roughly two seconds. The second path is the right one for almost everyone – Raw isn’t built for direct inference anyway, it’s a base for training. Yet half the tutorials published this week walk readers straight into the Raw setup. We’ll skip that detour.

The weights landed on Hugging Face on June 23-24, 2026, and the community ran with them overnight – within hours of release, Krea 2 was running in ComfyUI, quantized variants were appearing, and the first wave of style experiments was live. Here’s what’s actually worth your time.

What Krea 2 is, in 60 seconds

Krea 2 uses a Qwen Image VAE, a 12B dense DiT backbone, and a Qwen3-VL text encoder with multi-layer feature aggregation. It ships as two checkpoints with very different jobs: Raw is a pretrained checkpoint with no distillation – diverse, malleable, built for fine-tuning and LoRA training. Raw is not tuned for final-quality generation on its own. Turbo is the one you actually generate with.

The numbers worth remembering: Turbo targets around two seconds at native 2K resolution on consumer hardware, using eight inference steps with classifier-free guidance disabled. Raw needs 52 steps and CFG 3.5. Per Krea’s technical report, Krea 2 sits in the top 10 on the Artificial Analysis text-to-image leaderboard, second among independent labs.
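To make the step counts concrete, here is a rough sketch of how many denoiser evaluations each checkpoint needs per image. It assumes the usual classifier-free guidance cost of two evaluations per step (conditional plus unconditional) and ignores the text encoder and VAE, so treat the ratio as a ballpark rather than a benchmark. The function name is made up for illustration.

```python
def forward_passes(steps: int, cfg_scale: float) -> int:
    """Count denoiser evaluations for one image.

    Assumption: guidance above 1.0 costs two evaluations per step
    (conditional + unconditional); otherwise one evaluation per step.
    """
    passes_per_step = 2 if cfg_scale > 1.0 else 1
    return steps * passes_per_step


turbo = forward_passes(steps=8, cfg_scale=1.0)   # Turbo: 8 steps, CFG disabled
raw = forward_passes(steps=52, cfg_scale=3.5)    # Raw: 52 steps, CFG 3.5

print(turbo, raw, raw / turbo)  # prints: 8 104 13.0
```

Under those assumptions Raw does about 13 times the denoiser work per image, which is most of why Turbo is the checkpoint you generate with.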

One naming trap before you download anything: Krea 2 is Krea AI’s own from-scratch 12B DiT released June 23, 2026 – a completely different model from the 2025 black-forest-labs/FLUX.1-Krea-dev (a BFL × Krea collaboration built on FLUX). Don’t mix their weights, sizes, or workflows. Mixing them silently breaks generations.

The hands-on path: Krea 2 Turbo in ComfyUI

Update first – ComfyUI 0.26.0 or later is required because older builds don’t recognize the krea2 architecture tag.

Three required files go in three folders, plus an optional style LoRA in a fourth. The FP8 Turbo build is the right default for most people – it reduces the model from 24.76 GiB (BF16) down to 12.01 GiB, making it runnable on 16GB and 24GB GPUs without a noticeable loss in output quality.

ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── krea2_turbo_fp8_scaled.safetensors
│   ├── text_encoders/
│   │   └── qwen3vl_4b_fp8_scaled.safetensors
│   ├── vae/
│   │   └── qwen_image_vae.safetensors
│   └── loras/
│       └── krea2_warmpastel.safetensors

Grab the files from the official ComfyUI Krea-2 guide (Comfy-Org/Krea-2 repo). Load the workflow template and point each loader at the right file. One detail people miss: load the Qwen3-VL 4B text encoder in ComfyUI’s CLIPLoader with type set to krea2. The default CLIP type won’t work.
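Before opening the workflow, it can save a confusing error to confirm the files landed where the loaders look for them. The sketch below checks the paths from the folder layout above. It assumes ComfyUI is installed in a folder named ComfyUI relative to where you run it, and the helper name missing_files is illustrative, not part of ComfyUI.

```python
from pathlib import Path

# Paths relative to ComfyUI/models, copied from the folder layout in this guide.
REQUIRED = [
    "diffusion_models/krea2_turbo_fp8_scaled.safetensors",
    "text_encoders/qwen3vl_4b_fp8_scaled.safetensors",
    "vae/qwen_image_vae.safetensors",
]
OPTIONAL = ["loras/krea2_warmpastel.safetensors"]


def missing_files(comfy_root, relative_paths=REQUIRED):
    """Return the entries of relative_paths that are not present on disk."""
    models_dir = Path(comfy_root) / "models"
    return [rel for rel in relative_paths if not (models_dir / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; change it to match your setup.
    for rel in missing_files("ComfyUI"):
        print(f"missing: models/{rel}")
    for rel in missing_files("ComfyUI", OPTIONAL):
        print(f"optional, not found: models/{rel}")
```

No output means all three required files and the optional LoRA are in place.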

Sampler settings from community workflows: Krea 2 Turbo runs at 1024-2048 resolution, 8 steps, CFG 0-1 (disabled), sampler er_sde, scheduler simple. Queue the prompt. That’s the loop.
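If you keep settings in a script or notes file, a small check against the community values above can catch accidental drift. This is only a sketch: the class and function names are made up, and it assumes the 1024-2048 range applies to each side of the image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerSettings:
    width: int
    height: int
    steps: int
    cfg: float
    sampler: str
    scheduler: str


def turbo_warnings(s: SamplerSettings) -> list[str]:
    """List the ways s departs from the community Turbo settings."""
    warnings = []
    # Assumption: the 1024-2048 range is per side, not total pixels.
    for name, side in (("width", s.width), ("height", s.height)):
        if not 1024 <= side <= 2048:
            warnings.append(f"{name}={side} is outside 1024-2048")
    if s.steps != 8:
        warnings.append(f"steps={s.steps}, Turbo workflows use 8")
    if s.cfg > 1.0:
        warnings.append(f"cfg={s.cfg} enables guidance, Turbo expects 0-1")
    if s.sampler != "er_sde":
        warnings.append(f"sampler={s.sampler}, community workflows use er_sde")
    if s.scheduler != "simple":
        warnings.append(f"scheduler={s.scheduler}, community workflows use simple")
    return warnings


settings = SamplerSettings(2048, 2048, 8, 1.0, "er_sde", "simple")
for line in turbo_warnings(settings):
    print(line)
```

The example settings match the recommendations, so nothing is printed. A non-empty list is a hint, not an error: these are community defaults, not hard limits.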

Pro tip: Fix the seed before you start tweaking. Krea 2’s whole sell is aesthetic variation, which means tiny prompt edits look like totally different generations unless you lock the seed first – useful when comparing prompt phrasing or testing a style LoRA on/off.
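One way to keep comparisons honest is to write the test grid down before generating anything. The sketch below builds every prompt and LoRA on/off combination with a single shared seed. The dictionary keys and the example prompts are hypothetical, and turning each job into an actual ComfyUI run is left out.

```python
from itertools import product


def ab_jobs(prompts, seed, lora_options=(None, "krea2_warmpastel.safetensors")):
    """Every prompt x LoRA combination, all sharing the same seed."""
    return [
        {"prompt": prompt, "lora": lora, "seed": seed}
        for prompt, lora in product(prompts, lora_options)
    ]


jobs = ab_jobs(
    ["a lighthouse at dusk", "a lighthouse at dusk, film grain"],
    seed=1234,
)
for job in jobs:
    print(job)
```

Two prompts and two LoRA options give four jobs. Because the seed never changes, any difference between the images comes from the prompt wording or the LoRA.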

Pitfalls people are hitting this week

The GGUF route looks attractive for 8-12GB cards but has a sharp edge. Standard GGUF custom nodes throw an "Unexpected architecture type" error out of the box, because their loaders don’t yet recognize the krea2 architecture tag in the GGUF metadata. You need a patched custom node fork. If you’re seeing that error, you didn’t install the wrong model; you installed the right model with the wrong loader.
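To check what a downloaded file actually declares, you can read the architecture string straight from the GGUF header without loading any tensors. The sketch below is a minimal reader that assumes a little-endian GGUF version 2 or later file and assumes the tag lives under the conventional general.architecture key. It is not the code any custom node uses.

```python
import io
import struct

GGUF_STRING, GGUF_ARRAY = 8, 9
# Byte widths of the fixed-size GGUF value types (type id -> size in bytes).
SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}


def _read(f, fmt):
    """Read one little-endian value described by a struct format string."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("truncated GGUF header")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8")


def _skip_value(f, value_type):
    """Advance past one metadata value without decoding it."""
    if value_type == GGUF_STRING:
        f.seek(_read(f, "<Q"), io.SEEK_CUR)
    elif value_type == GGUF_ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        if item_type in SCALAR_SIZES:
            f.seek(count * SCALAR_SIZES[item_type], io.SEEK_CUR)
        else:
            for _ in range(count):
                _skip_value(f, item_type)
    else:
        f.seek(SCALAR_SIZES[value_type], io.SEEK_CUR)


def read_architecture(f):
    """Return the general.architecture string, or None if the key is absent."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _read(f, "<I")             # format version
    _read(f, "<Q")             # tensor count
    kv_count = _read(f, "<Q")  # number of metadata key/value pairs
    for _ in range(kv_count):
        key = _read_string(f)
        value_type = _read(f, "<I")
        if key == "general.architecture" and value_type == GGUF_STRING:
            return _read_string(f)
        _skip_value(f, value_type)
    return None


# Usage (the file name is a placeholder):
# with open("your_model.gguf", "rb") as f:
#     print(read_architecture(f))
```

If this reports krea2 and your loader still errors out, the file is fine and the loader is the part that needs replacing.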

Raw isn’t plug-and-play in ComfyUI either. The Load Diffusion Model node expects a ComfyUI-format single-file checkpoint. The official raw.safetensors uses Krea’s reference inference.py key layout – not the ComfyUI key layout. So this path is turnkey only once a ComfyUI-format Raw checkpoint is published. Until then, Raw means either community quants (which need a ComfyUI build with comfy_quant support) or running BF16 through diffusers with CPU offload – slow.

The license angle is different from what most coverage suggests. The 50-seat threshold isn’t the only catch. Under the Krea 2 Community License, deployers must implement content filtering measures or equivalent review processes regardless of team size. A solo dev shipping a public demo without input/output classifiers is technically not compliant – even on the free tier.

A brief aside

The speed at which the community quantized this model is worth noting for a different reason. Within hours of the weights dropping, FP8, NVFP4, and MXFP8 variants were live on Hugging Face – community members effectively did in an afternoon what used to require a dedicated inference team. That’s not unique to Krea 2, but it does suggest the bottleneck for running frontier-quality image models locally has genuinely shifted from “can you get the weights” to “do you know which node fork to use.”

What you actually get: VRAM vs. variants

The quant options moved fast. Here’s the picture for a single 16GB card, pulled from community testing as of late June 2026:

Variant	Size	Notes
BF16 Turbo	24.76 GiB	24GB+ cards only
FP8 Turbo (e4m3fn)	12.01 GiB	Recommended default, 16GB fits
NVFP4 Turbo	7.15 GiB	Smallest, Blackwell-friendly
MXFP8 Turbo	12.60 GiB	Alternative FP8 format
Raw BF16	~24.76 GiB	Training only, not inference (community-reported, may vary)

One non-obvious reason 16GB works for FP8 Turbo: ComfyUI loads and runs the Qwen3-VL text encoder to encode your prompt, then frees it before the diffusion sampling stage – so the encoder and the 12.01 GiB diffusion model are not both resident in VRAM at peak. If you swap to a workflow that keeps the encoder loaded for batching, your VRAM math changes.
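The difference between the two loading patterns is simple arithmetic, sketched below. The Turbo sizes come from the table above. The encoder size is a placeholder, since this article does not give one, so substitute the real file size of your text encoder. The calculation covers weights only and ignores activations, the VAE, and framework overhead.

```python
# Turbo checkpoint sizes in GiB, from the table in this section.
TURBO_SIZES_GIB = {"BF16": 24.76, "FP8": 12.01, "NVFP4": 7.15, "MXFP8": 12.60}

# Placeholder value for illustration only; check your actual encoder file.
ENCODER_GIB = 4.5


def peak_weights_gib(diffusion_gib, encoder_gib, encoder_stays_loaded):
    """Peak weight residency in GiB.

    If the encoder is freed before sampling, only the larger of the two
    models is resident at any one time. If it stays loaded, both are.
    """
    if encoder_stays_loaded:
        return diffusion_gib + encoder_gib
    return max(diffusion_gib, encoder_gib)


for keep in (False, True):
    peak = peak_weights_gib(TURBO_SIZES_GIB["FP8"], ENCODER_GIB, keep)
    print(f"encoder kept loaded={keep}: about {peak:.2f} GiB of weights at peak")
```

With the encoder freed, the peak is just the diffusion model. With it kept resident, the two sizes add, which is where a 16GB budget can stop working.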

Sizes sourced from AlperKTS/Krea2_FP8 and community benchmarks, as of late June 2026.

When NOT to reach for Krea 2

Krea’s own positioning is “aesthetic-first” – that framing matters. Infographics, charts, screenshots with readable UI text – not this model’s job. Dense text rendering is improving across image models but aesthetic quality and text accuracy are different optimization targets.

Skip Krea 2 if you need photoreal product compositing with strict brand consistency from day one – that’s a LoRA-training problem, and training a usable LoRA isn’t a 10-minute task. Non-technical users who just want “a good image fast” inside a hosted UI will find the Krea web product (paid) more practical than a local install.

And if you’re an organization above the seat threshold or shipping a customer-facing product without a moderation pipeline, the license closes the door before the model does.

A question worth sitting with

Open weights at this quality tier used to mean Stable Diffusion forks. Now an independent lab with a top-10 ranked model is handing out a 12B checkpoint and saying “go train your own LoRAs.” Does that mean the SaaS image generators get squeezed, or does it just shift the moat from model access to model orchestration? Genuinely unclear yet – worth watching what happens to FLUX pricing over the next month.

FAQ

Can I run Krea 2 Turbo on 8GB VRAM?

Yes, via NVFP4 (7.15 GiB) or smaller GGUF quants. The GGUF route needs the patched GGUF custom node, and quality drops noticeably below FP8.

Do existing FLUX or SD LoRAs work with Krea 2?

No. Krea 2 is a completely separate architecture – single-stream DiT with Qwen3-VL as text encoder, not T5 or CLIP. LoRAs are model-specific. That said, Krea released official style LoRAs (krea2_warmpastel, krea2_coolblue, krea2_darkbrush, krea2_plasmoid) trained on Raw and designed to apply on Turbo – your starting set until the community builds more.

Is the 2-second generation claim realistic on my hardware?

It’s measured on consumer GPUs running FP8 Turbo at 8 steps with CFG disabled. Change any of those – switch to BF16, raise the step count, or enable CFG – and the time climbs. The claim isn’t marketing fiction, but it’s a specific configuration, not a hardware-agnostic floor.

Next step: update ComfyUI to 0.26.0, grab krea2_turbo_fp8_scaled.safetensors from Comfy-Org/Krea-2, drop in the template workflow, and run one prompt with each official style LoRA. Twenty minutes to a real feel for what this model can do.