Last week: three hours fixing hands in AI portraits. Inpainting, negative prompts, ControlNet depth maps. Some worked. Most didn’t. Then I figured out the real question: when to fix vs. when to just regenerate.
Turns out the answer isn’t “always inpaint.” Sometimes regenerating is 5x faster. Sometimes inpainting makes things worse.
Fix or Regenerate? The 2-Minute Decision
Here’s the break-even: if your image has 2-3 small issues (wonky finger, slightly blurred face), fix it. If hands are unrecognizable blobs or you’ve got multiple problems (both hands broken, face distorted, weird lighting), regenerate with better settings. Community testing shows fixing takes 2-5 minutes per hand when it works – 15+ when it doesn’t.
The rule? If you can’t immediately see what the hand should look like, don’t try. The AI won’t know either.
Most tutorials jump straight to inpainting. Backwards. Decide first, then commit.
Inpainting’s Hidden Failure Mode
The model is context-aware. Sounds smart. Except when you mask a hand surrounded by nothing but shirt fabric, the model sees only shirt and cannot generate a hand from that.
Think of it like asking someone to draw a hand while blindfolded, but you describe only the sleeve nearby. They’ll guess. Badly.
Always include part of the arm, torso, or an object the hand is holding in your mask. Give the model something hand-adjacent. A face in the padding area helps too – tells the AI “this is a person.”
I learned this trying to add a hand reaching into frame on a plain background. Twenty generations, zero hands. Expanded the mask to include the shoulder – worked on try two.
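If you build masks programmatically (say, for a scripted inpainting pipeline), the same advice means padding the hand’s bounding box before rasterizing the mask. A minimal sketch in plain Python – the (left, top, right, bottom) box format is my assumption, not any tool’s API:

```python
def expand_mask_box(box, pad, image_size):
    """Grow an inpainting mask's bounding box so the model sees context
    (wrist, forearm, held objects), clamped to the image bounds.

    box: (left, top, right, bottom) in pixels
    pad: context padding in pixels (64-128 works well for hands)
    image_size: (width, height)
    """
    left, top, right, bottom = box
    width, height = image_size
    return (
        max(0, left - pad),
        max(0, top - pad),
        min(width, right + pad),
        min(height, bottom + pad),
    )

# A hand box near the image edge gets clamped instead of going negative:
# expand_mask_box((10, 300, 200, 500), 96, (1024, 1024))
# -> (0, 204, 296, 596)
```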
The Inpainting Setup
Upscale first. Everyone skips this. Your image is 1024px? Upscale it to 3000-4000px before inpainting (as of 2025, tested across Pincel workflows). The difference is dramatic – same technique, wildly better results. Use Upscayl with the UltraSharp model, or Topaz Gigapixel if you’re paying.
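If you script this step, the needed scale factor is just the target long edge divided by the current one. A sketch – the 3500px target is my pick from the middle of the 3000-4000 range:

```python
import math

def upscale_factor(width, height, target_long_edge=3500):
    """Integer upscale factor to bring an image into the 3000-4000px range
    before inpainting. Most upscalers (Upscayl, ESRGAN variants) work in
    whole multiples like 2x or 4x, so round up rather than down."""
    long_edge = max(width, height)
    if long_edge >= target_long_edge:
        return 1  # already big enough, skip the upscale
    return math.ceil(target_long_edge / long_edge)

# upscale_factor(1024, 1024) -> 4  (1024 * 4 = 4096, inside the target range)
```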
Mask generously – don’t just circle the bad hand. Include wrist, part of forearm, anything the hand touches. For faces, hair and neck too. Set inpaint area to “Only Masked” (not “Whole Picture”) so it scales up the masked region internally.
Denoise: 0.7-0.8 for the first pass. You want change. Stable Diffusion Art testing (2024-2025) puts 0.75 as the sweet spot. For refinement passes (fixing a finger that’s slightly too long), drop to 0.2-0.5.
Prompt for the area. Not your original prompt. Write “detailed hand, five fingers, natural skin texture” or “beautiful face, symmetric features, detailed eyes.” Generate 3-5 variations. Pick the best. None work? You’re in the regeneration zone.
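The whole setup boils down to two parameter bundles: an aggressive first pass and a gentle refinement pass. A sketch as plain dicts – the key names are my labels, not any specific tool’s API; the values and prompt come straight from the steps above:

```python
def inpaint_settings(pass_type="first"):
    """Parameter bundles for hand inpainting. Key names are illustrative,
    not tied to a specific UI or library."""
    common = {
        "inpaint_area": "Only Masked",   # scale up the masked region internally
        "prompt": "detailed hand, five fingers, natural skin texture",
    }
    if pass_type == "first":
        # High denoise: you want real change on the first pass.
        return {**common, "denoise": 0.75, "variations": 5}
    # Refinement: nudge the result, don't repaint it.
    return {**common, "denoise": 0.3, "variations": 3}

# inpaint_settings()["denoise"]          -> 0.75
# inpaint_settings("refine")["denoise"]  -> 0.3
```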
ControlNet (When You Need a Specific Pose)
Inpainting gives you randomness. ControlNet with depth maps gives you control: use DWPose or HandRefiner to detect the existing (broken) hand structure, generate a depth map showing where the fingers should be, then inpaint guided by that. Testing in ComfyUI (as of 2025) shows a 90% fix rate when settings are dialed in.
Setup in AUTOMATIC1111:

1. Install the ControlNet extension.
2. Download a depth ControlNet model (or HandRefiner, which has hand-specific training).
3. Enable ControlNet on the inpainting tab and select the depth_hand_refiner preprocessor.
4. Upload your broken image as the reference.
5. Set control weight to 0.6-0.7, not 1.0 (1.0 kills texture).
The depth map extracts hand shape and tells the model “five fingers go here, thumb here, palm here.” More consistent than pure inpainting. Also slower and more complex.
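A scripted workflow can encode the same guardrails. A hedged sketch – the field names are my own, and the hard 0.4-0.8 bound reflects the texture-loss warning about weight 1.0:

```python
def controlnet_hand_config(control_weight=0.65):
    """ControlNet depth settings for hand inpainting. Field names are
    illustrative; the values follow the article's guidance (0.6-0.7
    recommended, 1.0 causes texture loss)."""
    if not 0.4 <= control_weight <= 0.8:
        raise ValueError("control weight outside 0.4-0.8; 1.0 kills texture")
    return {
        "preprocessor": "depth_hand_refiner",
        "model": "depth ControlNet (or HandRefiner)",
        "control_weight": control_weight,
    }
```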
When HandRefiner Fails
HandRefiner reconstructs a 3D mesh from your image, then uses that to guide generation. But there’s a hard limit: if the original hand is so malformed it’s unrecognizable to human eyes, mesh reconstruction fails completely. No mesh, no fix.
You also won’t fix hand size – if your image has a giant broken hand, HandRefiner gives you a giant correct hand. It fits the mesh to what’s there, not what should be there.
Prevention: Newer Models
Midjourney V7 (released April 2025, default June 2025): 85-90% correct hands on simple poses. Stable Diffusion 3.5 (October 2024): fixed anatomical issues. For open-source, SDXL-based models (Pony Diffusion, Illustrious) beat SD 1.5 significantly (as of 2025). Not perfect – no model is – but you’ll spend less time fixing.
Prompting? Positive works better than negative. “Detailed hands, five fingers, natural gesture” primes the model to pay attention. Negative prompts like “extra fingers, bad hands, deformed” reduce some failures (fewer six-fingered hands) but don’t improve pose quality or joint alignment (2025 analysis).
Switching models solves more problems than any technique.
Three Settings That Break Good Hands
Padding too small. The “Only Masked Padding” slider defaults to 32 pixels. Fine for faces. For hands? You need 64-128 pixels so the model sees arm, torso, context. Without it, the hand floats in a void.
Control strength at 1.0. Using HandRefiner or depth ControlNet? The research paper warns control strength of 1.0 causes texture loss. Skin looks plastic. Use 0.4-0.8. Most tutorials skip this.
Wrong iterative strategy. Fixing hands takes multiple passes. The trick: start high denoise and low steps (0.7 denoise, 20 steps), then lower denoise and raise steps for refinement (0.4 denoise, 40 steps). Going the opposite direction adds noise instead of refining.
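The two-pass strategy, written out – values straight from the paragraph above:

```python
def denoise_schedule():
    """Iterative hand-fix passes: coarse first, then refine.
    (0.7 denoise, 20 steps) reshapes the hand; (0.4, 40) polishes it.
    Reversing the order adds noise instead of refining."""
    return [
        {"denoise": 0.7, "steps": 20},  # pass 1: big structural change
        {"denoise": 0.4, "steps": 40},  # pass 2: refinement
    ]
```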
The Decision Tree
| Situation | Action | Why |
|---|---|---|
| Hand has 6 fingers, otherwise normal | Inpaint | Clear target, high success rate |
| Fingers are blob/unrecognizable | Regenerate | No mesh to reconstruct, inpainting guesses randomly |
| Thumb too long, everything else fine | Inpaint with low denoise (0.3) | Minor fix, preserve the rest |
| Both hands broken + face blurry | Regenerate with better model/prompt | Fixing multiple areas takes longer than regenerating |
| Hand in complex pose (interlaced fingers) | ControlNet depth or regenerate | Basic inpainting won’t maintain pose structure |
| Face has wrong expression, anatomy OK | Inpaint face only, denoise 0.2 | Faces are easier – model has tons of face training data |
When in doubt? Try three regenerations with a better prompt. If all three still have broken hands, then commit to fixing.
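The table collapses into a small triage helper. A sketch – counting issues and the threshold are my simplification of the table, and severity matters more than the raw count:

```python
def fix_or_regenerate(num_issues, hand_recognizable=True):
    """Rough triage following the decision tree above.

    num_issues: how many problem areas the image has
    hand_recognizable: False if the hand is an unrecognizable blob
    Thresholds are rules of thumb from the table, not hard limits.
    """
    if not hand_recognizable:
        return "regenerate"   # no clear target -> inpainting guesses randomly
    if num_issues > 3:
        return "regenerate"   # fixing many areas costs more than a re-roll
    return "inpaint"          # few small issues with a clear target

# fix_or_regenerate(1)                            -> "inpaint"
# fix_or_regenerate(2, hand_recognizable=False)   -> "regenerate"
```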
Face Fixes
Faces are easier. The model has seen way more faces than hands in training, and faces don’t have infinite pose variability.
Quick fix: built-in face restoration (CodeFormer or GFPGAN in AUTOMATIC1111). Enable it and tune the weight – note that CodeFormer’s weight slider in AUTOMATIC1111 is inverted (0 = maximum effect, 1 = minimum). Use the gentlest setting that still works; stronger restoration shifts the style. Handles blurry eyes, weird teeth, asymmetry.
For expression changes or major fixes? Inpaint with denoise 0.2-0.4. Faces tolerate low denoise well – the model knows what a face should look like. You’re just nudging it toward a better sample.
One trick: garbled faces on small background figures? Upscale the whole image first (so the face is bigger), then fix. Insufficient pixel coverage is often the real problem.
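A quick way to formalize that check – the ~200px minimum face size is my own threshold, not from any tool:

```python
def needs_upscale_first(face_box, min_face_px=200):
    """True if a face region is too small to fix in place, meaning the
    whole image should be upscaled before inpainting.

    face_box: (left, top, right, bottom) in pixels
    min_face_px: assumed minimum face edge for reliable fixes
    """
    left, top, right, bottom = face_box
    return min(right - left, bottom - top) < min_face_px

# A 120x140px background face: upscale the whole image first.
# needs_upscale_first((500, 300, 620, 440)) -> True
```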
FAQ
Do negative prompts actually help with hands?
A bit. “No extra fingers, correct anatomy” reduces the worst failures – fewer six-fingered hands. But they don’t improve joint alignment or pose plausibility. Testing shows overusing them can degrade overall quality. Better: positive descriptive prompts (“detailed five-fingered hand, natural gesture”).
How many times should I try inpainting before giving up?
Three to five generations. None close? You’re fighting the model. Your mask lacks context, the original hand is too broken to reconstruct, or you need ControlNet guidance. I once spent 40 minutes on a hand that was never going to work – should’ve regenerated after attempt four. Community consensus: if fixing takes longer than regenerating the whole image three times, regenerate.
Can I use these techniques with Midjourney or DALL-E?
Midjourney and DALL-E don’t expose inpainting controls the same way Stable Diffusion does. Midjourney V7 has “Vary (Region)” – select the broken area, regenerate. Works like inpainting. DALL-E 3 (via ChatGPT) lets you ask conversationally: “Regenerate with the left hand showing five fingers clearly.” Both are simpler but less controllable. For maximum control, you need Stable Diffusion with AUTOMATIC1111 or ComfyUI. But honestly? Midjourney V7 (released April 2025) is good enough at hands that you’ll rarely need fixes for simple poses. Complex interlaced fingers or holding small objects? Still tricky, but the base quality is high enough that 8 out of 10 generations work without intervention.
Start with upscaling. Mask with context. Denoise at 0.7. Doesn’t work in three tries? Regenerate.