How AI Virtual Staging Actually Works (in Plain English)
Published 2026-04-14 · Category Product
Virtual staging via AI is not magic — it's instruction-based image editing. Here's what it can do, what it can't, and why the quality gap between tools is so large.
AI virtual staging, under the hood
Most "AI virtual staging" tools you see advertised today fall into two technical camps.
Camp 1: Free-form text-to-image
You describe a scene ("modern minimalist living room with linen sofa") and the model draws one from scratch. The output looks great — but it has nothing to do with your actual room. Walls, windows, and camera angle all change. You can't use the result on an MLS listing because it's no longer a photo of the property.
Camp 2: Image-to-image with a preserve instruction
This is what we do. The AI is given:
1. The original photo (via a public URL)
2. A text instruction describing the style change
3. A strong preserve clause — explicit language telling the model not to touch the walls, windows, doors, floor, ceiling, or camera angle
The model then restyles only the furniture and decor, leaving the architecture intact. The result is the same room, restaged.
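The three inputs above map naturally onto a single request payload. As a minimal sketch — the field names and function are illustrative, not a real API schema:

```python
def build_staging_request(photo_url: str, style_instruction: str, preserve_clause: str) -> dict:
    """Assemble the three inputs into one image-to-image request.
    Field names are illustrative, not an actual API schema."""
    return {
        "image_url": photo_url,  # 1. the original photo, via a public URL
        # 2 + 3. the style instruction followed by the preserve clause
        "prompt": f"{style_instruction}. {preserve_clause}",
        "mode": "image-to-image",
    }

request = build_staging_request(
    "https://example.com/listing/living-room.jpg",
    "Restage with a modern minimalist style: linen sofa, oak coffee table",
    "Do not change the walls, windows, doors, floor, ceiling, or camera angle",
)
```

The key design point is that the preserve clause travels with every request, so the model never sees a style instruction without its guardrails.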
Why preserve clauses matter
Models are trained to be creative. Without explicit instructions, they will "improve" walls, widen a window, add a skylight — anything they think looks better. Our prompt pipeline tells the model exactly which elements are off-limits and which are fair game.
The strength of the preserve clause is tunable:
- 100% preserve: Keep walls, windows, doors, floor, and the existing furniture layout. Only swap accessories (cushions, rugs, art, plants).
- 60% preserve (default): Keep walls, windows, doors, floor, ceiling. Replace all furniture.
- 30% preserve: Keep only the room shape and camera angle. Freely restyle the entire interior.
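The three tiers above amount to a lookup from preserve strength to prompt language. A sketch — the tiers come from this article, but the exact clause wording and the function are assumptions:

```python
# Illustrative mapping of preserve strength to prompt language.
# The tiers are from the article; the exact wording is an assumption.
PRESERVE_CLAUSES = {
    100: ("Keep walls, windows, doors, floor, and the existing furniture layout. "
          "Only swap accessories such as cushions, rugs, art, and plants."),
    60:  ("Keep walls, windows, doors, floor, and ceiling. "
          "Replace all furniture."),
    30:  ("Keep only the room shape and camera angle. "
          "Freely restyle the entire interior."),
}

def preserve_clause(strength: int = 60) -> str:
    """Return the preserve clause for a tier (60 is the default tier)."""
    return PRESERVE_CLAUSES[strength]
```

Keeping the tiers as fixed, pre-written clauses (rather than free text) is what makes the behavior predictable: the model always receives one of three known instructions.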
What AI still gets wrong
- Specular reflections — Chrome and glass sometimes look off.
- Text on signs / book spines — The model invents illegible pseudo-text.
- Hands, faces — If your photo has people, they get mangled. We recommend empty-room photography.
- Very dim scenes — The model sometimes over-brightens. Daylight shots work best.
How we validate quality automatically
After every generation, we compute a CLIP similarity score between the original and the output. Outputs that drift too far (similarity below 0.5) are automatically retried with a stronger preserve clause. Outputs that barely changed (similarity above 0.95) are retried with a weaker preserve clause. This retry loop is why the quality gap between our outputs and "one-shot" tools is so noticeable.
What this means for you as a user
- Use wider shots. The model does better with context.
- Choose the use case honestly (staging / restyle / soft restyle). The preserve clause is tied to it.
- If a result looks wrong, retry — it costs you nothing on paid plans.
FAQ
Does it work on occupied rooms?
Yes. Use the Restyle or Soft Restyle modes. Restyle swaps out all furniture; Soft Restyle keeps the big pieces and only swaps accessories.
Can it add a fireplace / windows that don't exist?
Not reliably, and not with our preserve clauses. AB 723 and most MLS rules also require that enhanced photos represent the actual property — adding features would fail that test.
What image resolution should I upload?
Higher resolution is better, up to 2048px on the long edge (we downscale anything larger). Under 1024px the model struggles with detail. JPEG and PNG both work.
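For the curious, the downscale behavior works out to a standard aspect-preserving resize. A sketch of the arithmetic, assuming the 2048px long-edge cap quoted above (this is not the actual server-side resizing code):

```python
def target_size(width: int, height: int, max_long_edge: int = 2048) -> tuple[int, int]:
    """Compute the dimensions an upload would be downscaled to, keeping
    the aspect ratio and capping the long edge at max_long_edge pixels.
    Illustrative only; not the actual server-side resizing code."""
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height  # already within the cap, left untouched
    scale = max_long_edge / long_edge
    return round(width * scale), round(height * scale)
```

So a 4000×3000 upload lands at 2048×1536, while anything at or under 2048px on the long edge passes through unchanged.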
How long does it take?
Average 30 seconds, p95 under 60 seconds. Paid users go on a priority queue.