Visual Quality Control (The Exorcist) 🧛♀️
Zirelia includes an advanced AI-based quality control system to prevent posting uncanny, distorted, or "cursed" images.
1. The Image Critic (critic.py)
Every image generated by Replicate is analyzed by GPT-4o-mini (Vision) before it is posted. The Critic checks for: * Anatomical Horrors: Heads turned 180° (Exorcist style). * Hand Distortion: Too many fingers, claw hands, or impossible grips. * Face Melt: Distorted facial features.
Workflow
- Generate: Image is created via Replicate (FLUX).
- Verify: URL is sent to OpenAI Vision API with a strict prompt.
- PASS: Image is approved for posting.
- REJECT: Image is discarded.
- Retry: If rejected, the system waits 3 minutes (to reset API limits) and tries again (max 3 attempts).
2. Safety Injection (Prompt Optimization)
To save costs and reduce rejection rates, the system proactively modifies prompts that are known to cause issues.
The Problem: Hands & Cups
AI models struggle with hands holding objects like coffee cups. This often results in alien fingers or floating mugs.
The Solution: _optimize_prompt_for_safety
When the system detects keywords like coffee, cup, latte, or drink, it automatically injects a safety modifier to simplify the composition:
| Original Concept | Safety Injection (Randomized) | Result |
|---|---|---|
| "drinking coffee" | cup resting on table next to her |
No hands visible (Safe) |
| "holding a latte" | drinking from cup, close up face, hands out of frame |
Hands hidden (Safe) |
| "morning coffee" | holding cup with both hands, detailed fingers |
High Detail (Risky but better) |
This reduces the "hallucination rate" by avoiding complex hand-object interactions when possible.
Configuration
This feature is enabled automatically if OPENAI_API_KEY is present.
To disable it (not recommended), you can remove the key or modify core/image_gen/pipeline.py.
3. Creative Expansion (The Muse) 🎨
To avoid repetitive images (e.g., getting the same "Morning Coffee" shot every time), the system now uses an LLM to expand simple topics into unique, detailed visual descriptions before generation.
Example: * Original Topic: "Morning coffee" * Expanded Prompt: "Sienna sitting on a sunlit balcony wearing a silk robe, holding a ceramic mug with both hands, soft morning haze, ocean view in background, candid smile."
This ensures variety while maintaining the persona's vibe.