The Imagination (Image Generation)
[!NOTE] Current State (as of 2026-02-20): The only active image provider is FLUX 1 via Replicate (remote API). Local/offline generation is not supported at this time. DALL-E 3 is available as a fallback if
REPLICATE_API_TOKENis not set.
The "Imagination" module converts text prompts into photorealistic images. It is powered by FLUX.1 running on Replicate, augmented with custom LoRA (Low-Rank Adaptation) models for identity consistency.
Architecture
- Prompt Engineering: The system takes a basic description (e.g., "Sienna running on the beach") and enhances it with stylistic keywords ("cinematic lighting", "film grain", "f/1.8", "shot on 35mm").
- Gender Anchor: If using a custom LoRA, the system automatically injects "a young woman" into the prompt to prevent gender drift (a common issue with fine-tuned models).
- Generation: The enhanced prompt is sent to
replicate.com's API. - Download & Temp Storage: The resulting image URL is downloaded to a temporary local file for upload to social platforms.
FLUX.1 & LoRA
We use FLUX.1 [dev] or FLUX.1.1 [pro] as the base model because of its superior realism and text adherence compared to SDXL.
Custom LoRA (Sienna Fox)
To maintain the same face across thousands of images, we trained a LoRA adapter.
* Trigger Word: TOK (or specific token used during training).
* Training Data: 52 synthetic images generated via Midjourney/Flux with consistent facial features.
Key Files
core/image_gen/pipeline.py: TheImageGeneratorclass.