The Complete AI Image Generation Workflow: From Prompt to Production-Ready Asset
A practical workflow for creating consistent, high-quality AI images using Midjourney, Flux, and ComfyUI. Covers prompting, upscaling, editing, and batch production.
Making one good AI image is easy. Making twenty consistent, brand-aligned, production-ready images for a real project is hard. This guide covers the complete workflow — from initial concept to final delivery — that professional AI artists and creative teams use in 2026.
We’ll cover three generation tools (Midjourney v7, Flux 1.1 Pro, and ComfyUI with SDXL/Flux), because no single tool is best for every use case. The workflow principles apply regardless of which tool you choose.
The Workflow Overview
1. Brief → 2. Style Reference → 3. Prompt Engineering → 4. Generation
↓ ↓
5. Selection → 6. Upscaling → 7. Editing → 8. Format Export
Most people jump straight to step 4 (generation) and wonder why their results are inconsistent. As a rule of thumb, the preparation steps (1-3) determine about 80% of the output quality.
Step 1: Define the Brief
Before touching any AI tool, answer these questions:
## Image Brief Template
**Purpose:** What is this image for? (blog header, social media, product mockup, etc.)
**Dimensions:** What aspect ratio and resolution do you need?
**Style:** What visual style? (photorealistic, illustration, 3D render, flat design)
**Mood:** What emotion should it convey? (professional, playful, dramatic, calm)
**Color Palette:** Any brand colors or restrictions?
**Subject:** What must be in the image?
**Exclusions:** What must NOT be in the image?
**Consistency:** Does this need to match existing images? (if so, attach references)
**Quantity:** How many variations do you need?
A clear brief prevents the most common waste: generating 50 images and using none because they don’t match what was actually needed.
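If the brief lives in code or a ticket system, a small data structure can catch blank fields before any generation starts. The sketch below is one possible shape, not a standard: the class name, field names, and the `missing_fields` helper are assumptions that simply mirror the template above.

```python
from dataclasses import dataclass, field


@dataclass
class ImageBrief:
    """One brief per image series; fields mirror the template above."""
    purpose: str
    aspect_ratio: str  # e.g. "16:9"
    style: str
    mood: str
    subject: str
    palette: list[str] = field(default_factory=list)     # brand colors, if any
    exclusions: list[str] = field(default_factory=list)  # what must NOT appear
    references: list[str] = field(default_factory=list)  # images to match
    quantity: int = 1

    def missing_fields(self) -> list[str]:
        """Return the names of required text fields that were left blank."""
        required = ("purpose", "aspect_ratio", "style", "mood", "subject")
        return [name for name in required if not getattr(self, name).strip()]


brief = ImageBrief(
    purpose="blog header",
    aspect_ratio="16:9",
    style="photorealistic",
    mood="professional",
    subject="",  # left blank on purpose to show the check
    quantity=4,
)
print(brief.missing_fields())
```

Refusing to generate while `missing_fields()` is non-empty is a cheap way to enforce the "brief first" rule.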
Step 2: Build a Style Reference Library
Consistency across multiple images requires style references. Collect 5-10 reference images that represent the visual language you want.
For Midjourney: Style Reference (--sref)
Midjourney v7’s --sref parameter lets you provide a URL to a style reference image:
A modern office workspace with natural lighting --sref https://example.com/your-style-ref.jpg --sw 100 --ar 16:9
- --sref: URL to your style reference image
- --sw 100: Style weight (0-1000, higher = stronger style influence)
- --ar 16:9: Aspect ratio
For a series of images, use the same --sref across all generations. This is the simplest path to visual consistency.
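One way to guarantee that every prompt in a series carries identical parameters is to build the prompt strings in code and paste them into Midjourney. This is a minimal sketch: the `midjourney_prompt` helper and the scene list are hypothetical, and only the `--sref`, `--sw`, and `--ar` flags come from the text above.

```python
def midjourney_prompt(description: str, sref_url: str,
                      style_weight: int = 100,
                      aspect_ratio: str = "16:9") -> str:
    """Append the same style parameters to any scene description."""
    if not 0 <= style_weight <= 1000:
        raise ValueError("--sw must be between 0 and 1000")
    return (f"{description.strip()} --sref {sref_url} "
            f"--sw {style_weight} --ar {aspect_ratio}")


STYLE_REF = "https://example.com/your-style-ref.jpg"  # placeholder URL

scenes = [
    "A modern office workspace with natural lighting",
    "A small meeting room with a whiteboard",
]
for scene in scenes:
    print(midjourney_prompt(scene, STYLE_REF))
```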
For Flux: IP-Adapter and ControlNet
Flux doesn’t have a built-in style reference parameter, but through ComfyUI, you can use IP-Adapter to inject style:
# ComfyUI workflow nodes:
1. Load Flux model
2. Load IP-Adapter model
3. Load style reference image
4. IP-Adapter Apply (strength: 0.6-0.8)
5. KSampler (steps: 28, cfg: 3.5, sampler: euler)
6. VAE Decode
7. Save Image
The IP-Adapter strength is crucial: 0.4-0.6 gives a subtle style influence; 0.8-1.0 closely mimics the reference. Start at 0.6 and adjust.
Step 3: Prompt Engineering for Images
Image prompts are fundamentally different from text prompts. Here’s the structure that works:
The Anatomy of an Effective Image Prompt
[Subject] + [Environment/Setting] + [Style/Medium] + [Lighting] + [Camera/Composition] + [Mood/Atmosphere] + [Quality Modifiers]
Example:
A senior software engineer working at a standing desk with three monitors,
modern minimalist office with floor-to-ceiling windows showing a city skyline,
editorial photography style, soft natural window light with subtle rim lighting,
shot from a slight low angle with shallow depth of field,
focused and confident atmosphere,
8K, ultra-detailed, professional color grading
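The slot order can be enforced in code so that every prompt in a project follows the same anatomy. The sketch below assumes hypothetical slot names (`subject`, `environment`, and so on) chosen to match the structure above; empty slots are skipped.

```python
PROMPT_ORDER = ("subject", "environment", "style", "lighting",
                "camera", "mood", "quality")


def build_prompt(**parts: str) -> str:
    """Join the provided slots in the fixed anatomy order."""
    unknown = set(parts) - set(PROMPT_ORDER)
    if unknown:
        raise ValueError(f"unknown prompt slots: {sorted(unknown)}")
    return ", ".join(
        parts[slot].strip()
        for slot in PROMPT_ORDER
        if parts.get(slot, "").strip()
    )


print(build_prompt(
    quality="8K, ultra-detailed",  # given first, still emitted last
    subject="A senior software engineer working at a standing desk",
    lighting="soft natural window light",
))
```

Keyword arguments can be passed in any order; the function always emits subject first and quality modifiers last.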
Prompt Component Library
Build a reusable library of prompt components:
Style modifiers:
Photorealistic: "editorial photography, DSLR, 85mm lens"
Illustration: "digital illustration, clean linework, vibrant colors"
3D Render: "3D render, octane render, subsurface scattering"
Flat Design: "flat design, vector art, geometric shapes, bold colors"
Cinematic: "cinematic, anamorphic lens, film grain, color graded"
Lighting modifiers:
Natural: "soft natural light, golden hour, window light"
Studio: "studio lighting, three-point lighting, softbox"
Dramatic: "dramatic chiaroscuro lighting, strong shadows"
Neon: "neon lighting, cyberpunk, reflective surfaces"
Ambient: "ambient occlusion, soft diffused light, overcast"
Composition modifiers:
Portrait: "close-up, shallow depth of field, bokeh background"
Wide: "wide angle, establishing shot, environmental"
Overhead: "top-down, bird's eye view, flat lay"
Isometric: "isometric view, 30-degree angle, technical illustration"
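A component library like this is easy to keep as plain dictionaries, so a prompt is assembled from short keys rather than retyped phrases. This is a sketch with a trimmed copy of the entries above; the dictionary names, key spellings, and `compose` function are illustrative assumptions.

```python
STYLE = {
    "photorealistic": "editorial photography, DSLR, 85mm lens",
    "flat_design": "flat design, vector art, geometric shapes, bold colors",
    "cinematic": "cinematic, anamorphic lens, film grain, color graded",
}
LIGHTING = {
    "natural": "soft natural light, golden hour, window light",
    "studio": "studio lighting, three-point lighting, softbox",
}
COMPOSITION = {
    "portrait": "close-up, shallow depth of field, bokeh background",
    "overhead": "top-down, bird's eye view, flat lay",
}


def compose(subject: str, style: str, lighting: str, composition: str) -> str:
    """Look up each component by key; an unknown key raises KeyError."""
    return ", ".join(
        [subject, STYLE[style], LIGHTING[lighting], COMPOSITION[composition]]
    )


print(compose("a ceramic mug on a table", "flat_design", "studio", "overhead"))
```

Because a misspelled key fails loudly, a typo cannot silently change the style of one image in a series.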
Step 4: Generation — Tool Selection
When to Use Midjourney v7
Best for:
- Quick concept exploration (30 seconds per generation)
- Photorealistic images of people and environments
- Art direction with style references
- Non-technical users who prefer a chat interface
/imagine A minimalist SaaS dashboard UI with analytics charts and a clean
sidebar navigation, on a MacBook Pro screen sitting on a wooden desk,
soft natural lighting from the left, product photography style,
shallow depth of field --ar 16:9 --sref [your-style-url] --v 7
When to Use Flux 1.1 Pro
Best for:
- Text rendering in images (Flux handles text better than any other model)
- Precise prompt following (Flux is more literal than Midjourney)
- API-based batch generation
- Integration into automated pipelines
import replicate

# Requires the REPLICATE_API_TOKEN environment variable to be set
output = replicate.run(
    "black-forest-labs/flux-1.1-pro",
    input={
        "prompt": "A professional blog header image showing a laptop with code on screen, minimalist desk setup, soft gradient background in deep blue to purple, the text 'AI Engineering' visible on the screen, editorial photography style",
        "aspect_ratio": "16:9",
        "output_format": "webp",
        "output_quality": 90,
        # Verify against the model's input schema: not every Flux variant
        # exposes a step count
        "num_inference_steps": 28,
    }
)
When to Use ComfyUI
Best for:
- Maximum control over every generation parameter
- Custom workflows with ControlNet, IP-Adapter, inpainting
- Batch processing with consistent settings
- Running locally (free, no API costs)
ComfyUI has a steep learning curve but offers unmatched flexibility. A typical production workflow:
Load Checkpoint → CLIP Text Encode → KSampler → VAE Decode → Upscale → Save
Additional nodes for production:
- ControlNet (for pose/composition control)
- IP-Adapter (for style consistency)
- Face Detailer (for face quality)
- Inpainting (for selective edits)
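For batch work, the basic chain above (without the upscale step) can be expressed as a JSON graph and queued through ComfyUI's HTTP API instead of the UI. The sketch below only builds the graph. The node class names and input keys should match what ComfyUI's API-format workflow export produces, but treat them as assumptions and compare against an export from your own install. The checkpoint filename, the SDXL-style `cfg` of 7.0, and the `build_workflow` helper are hypothetical.

```python
import json


def build_workflow(prompt: str, negative: str = "", seed: int = 0,
                   checkpoint: str = "sd_xl_base_1.0.safetensors") -> dict:
    """Return a node graph; links are [source_node_id, output_index]."""
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": checkpoint}},
        "2": {"class_type": "CLIPTextEncode",  # positive prompt
              "inputs": {"text": prompt, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",  # negative prompt
              "inputs": {"text": negative, "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": 28, "cfg": 7.0,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "batch"}},
    }


workflow = build_workflow("modern minimalist office, editorial photography", seed=42)
payload = json.dumps({"prompt": workflow})  # body for a POST to /prompt
print(len(workflow), "nodes,", len(payload), "bytes")
```

Keeping the seed and sampler settings as function arguments is what makes batch runs reproducible: the same graph with the same seed should yield the same image on the same install.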
Step 5: Selection and Curation
Generate 4-8 variations per concept. Evaluate each against your brief:
## Image Evaluation Checklist
□ Does it match the brief's subject requirements?
□ Is the style consistent with references?
□ Are there any anatomical errors (hands, faces)?
□ Is the composition balanced?
□ Does the color palette match brand guidelines?
□ Are there any artifacts or visual glitches?
□ Is text (if any) readable and correctly spelled?
□ Would this look professional in the final context?
Be ruthless. It’s faster to generate a new batch than to fix a fundamentally flawed image.
Step 6: Upscaling
Most AI generators output at 1024x1024 or similar resolutions. Production assets typically need 2x-4x more resolution.
Recommended Upscaling Pipeline
1. AI Generation (1024x1024)
↓
2. First upscale: Topaz Photo AI or Real-ESRGAN (4x → 4096x4096)
↓
3. Optional: Magnific AI for creative detail enhancement
↓
4. Final resize to target dimensions
Using ComfyUI, you can integrate upscaling into the generation workflow:
KSampler (1024x1024) → Upscale Latent (2x) → KSampler (high-res fix, 0.4 denoise) → VAE Decode → Save (2048x2048)
The high-res fix approach (upscale in latent space, then re-denoise at low strength) adds coherent detail rather than just scaling pixels.
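How much upscaling an asset needs is simple arithmetic: the larger of the width ratio and the height ratio, so that the upscaled image covers the target in both directions. The sketch below works that out; the function names and the set of available steps (1x, 2x, 4x) are assumptions for illustration.

```python
def upscale_factor(src: tuple[int, int], target: tuple[int, int]) -> float:
    """Minimum scale so the source covers the target in both dimensions."""
    src_w, src_h = src
    target_w, target_h = target
    return max(target_w / src_w, target_h / src_h)


def smallest_sufficient_step(factor: float, steps=(1, 2, 4)) -> int:
    """Pick the smallest available upscale step that reaches the factor."""
    for step in sorted(steps):
        if step >= factor:
            return step
    raise ValueError("target exceeds the largest step; chain upscalers "
                     "or generate at a higher resolution")


# A 1024x1024 generation destined for an A4 print at 3508x2480
need = upscale_factor((1024, 1024), (3508, 2480))
print(need, smallest_sufficient_step(need))
```

For the A4 case the width ratio dominates (3508 / 1024 is about 3.43), so a 4x pass followed by a final resize down is the smallest step that avoids stretching pixels.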
Step 7: Post-Processing
Photoshop / GIMP Fixes
Even the best AI images need minor fixes:
- Color correction to match brand palette
- Cropping to exact required dimensions
- Spot removal for minor artifacts
- Background cleanup for product images
Batch Processing with ImageMagick
For bulk post-processing:
# Create the output directory if it does not exist yet
mkdir -p output
# Resize all images to 1200x630 (blog headers)
for img in generated/*.webp; do
  magick "$img" -resize 1200x630^ -gravity center -extent 1200x630 \
    -quality 85 "output/$(basename "$img")"
done
# Add consistent color grading (warm tone) to the resized copies, in place
for img in output/*.webp; do
  magick "$img" -modulate 100,95,102 -brightness-contrast 2x5 \
    -quality 85 "$img"
done
# Convert to multiple formats
for img in output/*.webp; do
  base=$(basename "$img" .webp)
  magick "$img" -quality 85 "output/${base}.jpg"
  magick "$img" -resize 600x315 "output/${base}-thumb.webp"
done
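The resize step uses a "fill then crop" geometry: scale the image until it covers the target box, then cut the overflow equally from both sides. The sketch below reproduces that arithmetic so you can see how much of a composition will be lost before running a batch. The `cover_crop` function is an illustration, and ImageMagick's own rounding may differ by a pixel in some cases.

```python
def cover_crop(src_w: int, src_h: int, dst_w: int, dst_h: int):
    """Return (scaled_w, scaled_h, left, top) for a centered cover crop."""
    # Scale by the larger ratio so the image fills the box in both directions
    scale = max(dst_w / src_w, dst_h / src_h)
    scaled_w = round(src_w * scale)
    scaled_h = round(src_h * scale)
    # Center the target box inside the scaled image
    left = (scaled_w - dst_w) // 2
    top = (scaled_h - dst_h) // 2
    return scaled_w, scaled_h, left, top


scaled_w, scaled_h, left, top = cover_crop(2048, 2048, 1200, 630)
lost_rows = scaled_h - 630
print(scaled_w, scaled_h, left, top, f"{lost_rows / scaled_h:.1%} of height cropped")
```

A square source turned into a 1200x630 header loses nearly half its height, which is why generating at the target aspect ratio matters more than any post-processing step.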
Step 8: Format and Export
Different platforms require different formats:
| Platform | Format | Dimensions | Max Size |
|---|---|---|---|
| Blog header | WebP | 1200x630 | 200KB |
| Twitter/X | PNG/JPG | 1200x675 | 5MB |
| Instagram | JPG | 1080x1080 | 30MB |
| LinkedIn | PNG | 1200x627 | 10MB |
| Thumbnail | WebP | 400x225 | 50KB |
| Print (A4) | TIFF/PNG | 3508x2480 | N/A |
Always export at the exact dimensions needed. Don’t rely on platforms to resize: they compress aggressively and degrade quality.
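Export requirements like these are easy to check automatically before delivery. The sketch below encodes three rows of the table and reports every mismatch for a file. The `EXPORT_SPECS` keys and the `check_export` function are hypothetical, and it assumes 1KB = 1024 bytes; reading the real dimensions and file size is left to whatever image library you already use.

```python
KB = 1024
MB = 1024 * KB

EXPORT_SPECS = {
    "blog_header": {"formats": {"webp"}, "size": (1200, 630), "max_bytes": 200 * KB},
    "twitter": {"formats": {"png", "jpg"}, "size": (1200, 675), "max_bytes": 5 * MB},
    "thumbnail": {"formats": {"webp"}, "size": (400, 225), "max_bytes": 50 * KB},
}


def check_export(target: str, fmt: str, width: int, height: int,
                 n_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    spec = EXPORT_SPECS[target]
    problems = []
    if fmt.lower() not in spec["formats"]:
        problems.append(f"format {fmt} not in {sorted(spec['formats'])}")
    if (width, height) != spec["size"]:
        expected_w, expected_h = spec["size"]
        problems.append(f"size {width}x{height}, expected {expected_w}x{expected_h}")
    if n_bytes > spec["max_bytes"]:
        problems.append(f"{n_bytes} bytes exceeds limit of {spec['max_bytes']}")
    return problems


print(check_export("blog_header", "webp", 1200, 630, 150 * KB))
print(check_export("thumbnail", "png", 1024, 1024, 300 * KB))
```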
Building a Consistent Image System
For a project that needs 20+ consistent images (like a blog, course, or marketing campaign), systematize:
1. Create a Prompt Template
[SUBJECT]: {varies per image}
[STYLE]: editorial photography, soft natural lighting, shallow depth of field
[PALETTE]: deep navy (#1a1a2e), warm gold (#e8d5b5), clean white (#fafafa)
[MOOD]: professional, approachable, modern
[COMPOSITION]: rule of thirds, negative space on left for text overlay
[QUALITY]: 8K, ultra-detailed, professional color grading
2. Generate a Style Reference Image First
Create one “hero” image that defines the visual language. Use this as the --sref (Midjourney) or IP-Adapter reference (ComfyUI) for all subsequent generations.
3. Batch Generate with Variables
subjects = [
    "AI chatbot interface on a laptop screen",
    "data visualization dashboard with charts",
    "team meeting with AI assistant on screen",
    "developer coding with AI pair programming",
]
base_prompt = """
{subject}, modern minimalist office environment,
editorial photography, soft window lighting,
shallow depth of field, professional color grading,
clean and modern aesthetic
"""
for subject in subjects:
    # Collapse the template's line breaks so the prompt is a single line
    prompt = " ".join(base_prompt.format(subject=subject).split())
    # generate_image is a placeholder for your own wrapper around the
    # Midjourney, Flux, or ComfyUI call; it is not defined here
    generate_image(prompt, style_ref="hero-image.webp")
4. Quality Control Checklist
After each batch:
- Lay out all images side by side
- Check for visual consistency (same lighting, same color temperature)
- Identify outliers and regenerate
- Apply identical post-processing to all images
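The side-by-side check can be backed by a rough numeric screen. The sketch below is one crude heuristic, not an established method: it assumes you have already computed each image's mean RGB color with your image tool of choice, uses red minus blue as a stand-in for color temperature, and flags images that sit far from the set's median. The sample values, the tolerance, and the function names are all illustrative.

```python
from statistics import median


def warmth(mean_rgb: tuple[float, float, float]) -> float:
    """Crude color temperature proxy: positive is warm, negative is cool."""
    red, _green, blue = mean_rgb
    return red - blue


def find_outliers(mean_colors: dict[str, tuple[float, float, float]],
                  tolerance: float = 15.0) -> list[str]:
    """Names of images whose warmth deviates from the set's median."""
    values = {name: warmth(rgb) for name, rgb in mean_colors.items()}
    center = median(values.values())
    return sorted(name for name, value in values.items()
                  if abs(value - center) > tolerance)


# Made-up mean colors for a batch of four images
batch = {
    "chatbot.webp": (140, 120, 110),
    "dashboard.webp": (138, 121, 112),
    "meeting.webp": (142, 119, 108),
    "coding.webp": (110, 120, 150),  # noticeably cooler than the rest
}
print(find_outliers(batch))
```

A screen like this only narrows down which images to look at; the final call on consistency still belongs to a person viewing the set together.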
Common Mistakes
- No style reference. Generating images without references produces random visual styles. Always have a reference.
- Over-prompting. Cramming 200 words into a prompt doesn’t make it better. Focus on the most important visual elements.
- Ignoring aspect ratio. Generating square images when you need 16:9 and then cropping wastes the most important parts of the composition.
- Skipping upscaling. 1024x1024 looks fine on a phone but terrible as a blog header on a desktop monitor.
- No batch consistency check. Looking at images individually instead of as a set. They might each look good but clash visually when displayed together.
The difference between amateur and professional AI image workflows isn’t the tool — it’s the process. Build the system, follow the steps, and your output will be consistent, on-brand, and production-ready every time.