Update README.md
README.md
CHANGED
|
@@ -4,19 +4,21 @@ emoji: 🎨
|
|
| 4 |
colorFrom: blue
|
| 5 |
colorTo: purple
|
| 6 |
sdk: gradio
|
| 7 |
-
sdk_version: 5.
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
---
|
| 11 |
|
| 12 |
# Marketing Image Generator with Agent Review
|
| 13 |
|
| 14 |
-
A sophisticated AI-powered image generation system that creates high-quality marketing images with automated quality review and refinement. Built on modern AI technologies including Google's
|
| 15 |
|
| 16 |
## Features
|
| 17 |
|
| 18 |
- **AI-Powered Image Generation**: Create stunning marketing images from text prompts using Google's Imagen4 via MCP server
|
| 19 |
-
- **Automated Quality Review**: Intelligent Gemini agent
|
| 20 |
- **Marketing-Focused**: Optimized for marketing materials, social media, and promotional content
|
| 21 |
- **Real-time Feedback**: Get instant quality scores and improvement suggestions
|
| 22 |
- **Professional Workflow**: Streamlined process from concept to final image
|
|
@@ -56,11 +58,11 @@ A sophisticated AI-powered image generation system that creates high-quality mar
|
|
| 56 |
|
| 57 |
### Core Components
|
| 58 |
|
| 59 |
-
- **Agent 1 (Image Generator)**: Creates images using Google's
|
| 60 |
- **Agent 2 (Marketing Reviewer)**: Analyzes image quality and provides marketing-focused feedback using Gemini Vision
|
| 61 |
- **Orchestrator**: Manages workflow between agents and handles handover
|
| 62 |
- **Web Interface**: Gradio-based user interface optimized for Hugging Face
|
| 63 |
-
- **MCP Server Integration**: Model Context Protocol for seamless
|
| 64 |
|
| 65 |
### System Architecture and Workflow
|
| 66 |
|
|
@@ -73,18 +75,18 @@ A sophisticated AI-powered image generation system that creates high-quality mar
|
|
| 73 |
│Reviewer     │───▶│             │───▶│ Agent 2 (Gemini) Marketing  │
|
| 74 |
│Prompt       │    │             │    │ Reviewer                    │
|
| 75 |
│             │    │             │    │                             │
|
| 76 |
-
│             │    │             │    │
|
| 77 |
-
│             │    │             │    │ │
|
| 78 |
-
│             │    │             │    │ │
|
| 79 |
-
│             │    │             │    │ │ Draft Image Creation
|
| 80 |
-
│             │    │             │    │
|
| 81 |
│             │    │             │    │                             │
|
| 82 |
-
│             │    │             │    │
|
| 83 |
-
│             │    │             │    │ │
|
| 84 |
-
│             │    │             │    │ │ & Changes Suggested
|
| 85 |
-
│             │    │             │    │
|
| 86 |
│             │    │             │    │                             │
|
| 87 |
-
│ Image       │◀───│             │◀───│ Final Image Response
|
| 88 |
│ Response    │    │             │    │                             │
|
| 89 |
└─────────────┘    └─────────────┘    └─────────────────────────────┘
|
| 90 |
```
|
|
@@ -104,12 +106,12 @@ A sophisticated AI-powered image generation system that creates high-quality mar
|
|
| 104 |
|
| 105 |
3. **Image Generation and Drafting (Top Right)**:
|
| 106 |
- **Agent 1 (Gemini) Drafter**: Receives Image Prompt, orchestrates image generation
|
| 107 |
-
- **
|
| 108 |
|
| 109 |
4. **Marketing Review and Refinement (Bottom Right)**:
|
| 110 |
- **Agent 2 (Gemini) Marketing Reviewer**: Receives Reviewer Prompt, evaluates generated image against marketing criteria
|
| 111 |
- **Draft Image Reviewed and Changes Suggested**: Agent 2's review process output
|
| 112 |
-
- **Iterative Refinement Loop**: Bidirectional feedback between Agent 2 and
|
| 113 |
- Final **Image Response** sent back to Gradio UI
|
| 114 |
|
| 115 |
### Summary of Flow:
|
|
@@ -117,12 +119,12 @@ User provides prompts → Gradio UI → Agent 1 drafts image with Imagen4 → Ag
|
|
| 117 |
|
| 118 |
### Technology Stack
|
| 119 |
|
| 120 |
-
- **AI Models**: Google Imagen4 (via MCP), Gemini Vision
|
| 121 |
- **Framework**: Gradio (Web Interface)
|
| 122 |
- **Orchestration**: Custom agent handover system
|
| 123 |
- **Deployment**: Hugging Face Spaces
|
| 124 |
- **Authentication**: Google Cloud API Keys
|
| 125 |
-
- **Protocol**: MCP (Model Context Protocol) for
|
| 126 |
|
| 127 |
### Why A2A Was Not Applied
|
| 128 |
|
|
@@ -179,7 +181,7 @@ quality_score = result["data"]["review"]["quality_score"]
|
|
| 179 |
- **Quality Threshold**: Minimum quality score for auto-approval
|
| 180 |
- **Max Iterations**: Maximum refinement attempts
|
| 181 |
- **Review Settings**: Customize review criteria
|
| 182 |
-
- **MCP Configuration**:
|
| 183 |
|
| 184 |
## Development
|
| 185 |
|
|
@@ -268,12 +270,52 @@ Access monitoring dashboards:
|
|
| 268 |
1. **API Key Errors**: Ensure your Google API keys are valid and configured as HF secrets
|
| 269 |
2. **Image Generation Fails**: Check your internet connection and API quotas
|
| 270 |
3. **Review Not Working**: Verify the Gemini agent is running and configured correctly
|
| 271 |
-
4. **MCP Connection Issues**: Check
|
| 272 |
|
| 273 |
### Debug Mode
|
| 274 |
|
| 275 |
Enable debug logging by setting `LOG_LEVEL=DEBUG` in your environment variables.
|
| 276 |
|
| 277 |
### Support
|
| 278 |
|
| 279 |
For issues and questions:
|
|
@@ -287,7 +329,7 @@ This project is licensed under the MIT License - see the LICENSE file for detail
|
|
| 287 |
|
| 288 |
## Acknowledgments
|
| 289 |
|
| 290 |
-
- Google AI for Imagen4 and Gemini technologies
|
| 291 |
- Hugging Face for the deployment platform
|
| 292 |
- Gradio for the web interface framework
|
| 293 |
-
- The open-source community for various dependencies
|
|
|
|
| 4 |
colorFrom: blue
|
| 5 |
colorTo: purple
|
| 6 |
sdk: gradio
|
| 7 |
+
sdk_version: 5.38.2
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
+
license: mit
|
| 11 |
+
short_description: AI marketing image generator with GCP Imagen4 + Gemini 2.5
|
| 12 |
---
|
| 13 |
|
| 14 |
# Marketing Image Generator with Agent Review
|
| 15 |
|
| 16 |
+
A sophisticated AI-powered image generation system that creates high-quality marketing images with automated quality review and refinement. It is built on Google's Imagen4 and Gemini 2.5 Pro, coordinated by a custom agent orchestrator.
|
| 17 |
|
| 18 |
## Features
|
| 19 |
|
| 20 |
- **AI-Powered Image Generation**: Create stunning marketing images from text prompts using Google's Imagen4 via MCP server
|
| 21 |
+
- **Automated Quality Review**: Intelligent Gemini agent automatically reviews and refines generated images
|
| 22 |
- **Marketing-Focused**: Optimized for marketing materials, social media, and promotional content
|
| 23 |
- **Real-time Feedback**: Get instant quality scores and improvement suggestions
|
| 24 |
- **Professional Workflow**: Streamlined process from concept to final image
|
|
|
|
| 58 |
|
| 59 |
### Core Components
|
| 60 |
|
| 61 |
+
- **Agent 1 (Image Generator)**: Creates images using Google's Imagen4 via MCP server integration
|
| 62 |
- **Agent 2 (Marketing Reviewer)**: Analyzes image quality and provides marketing-focused feedback using Gemini Vision
|
| 63 |
- **Orchestrator**: Manages workflow between agents and handles handover
|
| 64 |
- **Web Interface**: Gradio-based user interface optimized for Hugging Face
|
| 65 |
+
- **MCP Server Integration**: Model Context Protocol for seamless Imagen4 access
|
| 66 |
|
| 67 |
### System Architecture and Workflow
|
| 68 |
|
|
|
|
| 75 |
│Reviewer     │───▶│             │───▶│ Agent 2 (Gemini) Marketing  │
|
| 76 |
│Prompt       │    │             │    │ Reviewer                    │
|
| 77 |
│             │    │             │    │                             │
|
| 78 |
+
│             │    │             │    │ ┌─────────────────────────┐ │
|
| 79 |
+
│             │    │             │    │ │ Imagen4 (via MCP)       │ │
|
| 80 |
+
│             │    │             │    │ │                         │ │
|
| 81 |
+
│             │    │             │    │ │ Draft Image Creation    │ │
|
| 82 |
+
│             │    │             │    │ └─────────────────────────┘ │
|
| 83 |
│             │    │             │    │                             │
|
| 84 |
+
│             │    │             │    │ ┌─────────────────────────┐ │
|
| 85 |
+
│             │    │             │    │ │ Draft Image Reviewed    │ │
|
| 86 |
+
│             │    │             │    │ │ & Changes Suggested     │ │
|
| 87 |
+
│             │    │             │    │ └─────────────────────────┘ │
|
| 88 |
│             │    │             │    │                             │
|
| 89 |
+
│ Image       │◀───│             │◀───│ Final Image Response        │
|
| 90 |
│ Response    │    │             │    │                             │
|
| 91 |
└─────────────┘    └─────────────┘    └─────────────────────────────┘
|
| 92 |
```
|
|
|
|
| 106 |
|
| 107 |
3. **Image Generation and Drafting (Top Right)**:
|
| 108 |
- **Agent 1 (Gemini) Drafter**: Receives Image Prompt, orchestrates image generation
|
| 109 |
+
- **Imagen4 (via MCP)**: Agent 1 interacts with Imagen4 through MCP server to create initial image draft
|
| 110 |
|
| 111 |
4. **Marketing Review and Refinement (Bottom Right)**:
|
| 112 |
- **Agent 2 (Gemini) Marketing Reviewer**: Receives Reviewer Prompt, evaluates generated image against marketing criteria
|
| 113 |
- **Draft Image Reviewed and Changes Suggested**: Agent 2's review process output
|
| 114 |
+
- **Iterative Refinement Loop**: Bidirectional feedback between Agent 2 and Imagen4 (via Agent 1) to refine image until it meets marketing standards
|
| 115 |
- Final **Image Response** sent back to Gradio UI
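
The review-and-refine loop described in steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the project's orchestrator: `refine`, `Review`, the `generate` and `review` callables, and the 0.0 to 1.0 score scale are all assumptions made for the example, standing in for Agent 1 (Imagen4 via MCP) and Agent 2 (Gemini).

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Review:
    """Hypothetical shape of Agent 2's output."""
    quality_score: float  # assumed 0.0-1.0 scale
    feedback: str         # suggested changes for the next draft


def refine(
    prompt: str,
    generate: Callable[[str], bytes],        # stand-in for Agent 1 + Imagen4
    review: Callable[[bytes, str], Review],  # stand-in for Agent 2
    threshold: float = 0.8,
    max_iterations: int = 3,
) -> Tuple[bytes, Review, int]:
    """Draft, review, and redraft until the score reaches the threshold."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")

    current_prompt = prompt
    for attempt in range(1, max_iterations + 1):
        image = generate(current_prompt)
        result = review(image, prompt)
        if result.quality_score >= threshold:
            break
        # Fold the reviewer's feedback into the next generation prompt.
        current_prompt = f"{prompt}\n\nReviewer feedback: {result.feedback}"

    # Returns the last draft even if it never reached the threshold.
    return image, result, attempt
```

Passing the two agents in as plain callables keeps the loop testable without network access; a real integration would wrap the MCP and Gemini calls behind those two functions.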
|
| 116 |
|
| 117 |
### Summary of Flow:
|
|
|
|
| 119 |
|
| 120 |
### Technology Stack
|
| 121 |
|
| 122 |
+
- **AI Models**: Google Imagen4 (via MCP), Gemini 2.5 Pro Vision
|
| 123 |
- **Framework**: Gradio (Web Interface)
|
| 124 |
- **Orchestration**: Custom agent handover system
|
| 125 |
- **Deployment**: Hugging Face Spaces
|
| 126 |
- **Authentication**: Google Cloud API Keys
|
| 127 |
+
- **Protocol**: MCP (Model Context Protocol) for Imagen4 integration
|
| 128 |
|
| 129 |
### Why A2A Was Not Applied
|
| 130 |
|
|
|
|
| 181 |
- **Quality Threshold**: Minimum quality score for auto-approval
|
| 182 |
- **Max Iterations**: Maximum refinement attempts
|
| 183 |
- **Review Settings**: Customize review criteria
|
| 184 |
+
- **MCP Configuration**: Imagen4 server settings
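
As an illustration of the first two options, the sketch below loads a quality threshold and an iteration cap from environment variables and validates them. The variable names `QUALITY_THRESHOLD` and `MAX_ITERATIONS`, the defaults, and the 0.0 to 1.0 range are assumptions for the example; the README does not specify how the app stores these settings.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ReviewConfig:
    """Hypothetical container for the review settings listed above."""
    quality_threshold: float = 0.8  # assumed 0.0-1.0 scale
    max_iterations: int = 3

    def __post_init__(self) -> None:
        if not 0.0 <= self.quality_threshold <= 1.0:
            raise ValueError("quality_threshold must be between 0.0 and 1.0")
        if self.max_iterations < 1:
            raise ValueError("max_iterations must be at least 1")


def load_config(env: Mapping[str, str] = os.environ) -> ReviewConfig:
    """Build a ReviewConfig from environment variables (names are assumed)."""
    return ReviewConfig(
        quality_threshold=float(env.get("QUALITY_THRESHOLD", "0.8")),
        max_iterations=int(env.get("MAX_ITERATIONS", "3")),
    )
```

Validating at construction time means a bad value fails at startup rather than partway through a generation run.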
|
| 185 |
|
| 186 |
## Development
|
| 187 |
|
|
|
|
| 270 |
1. **API Key Errors**: Ensure your Google API keys are valid and configured as HF secrets
|
| 271 |
2. **Image Generation Fails**: Check your internet connection and API quotas
|
| 272 |
3. **Review Not Working**: Verify the Gemini agent is running and configured correctly
|
| 273 |
+
4. **MCP Connection Issues**: Check Imagen4 server connectivity and configuration
|
| 274 |
+
|
| 275 |
+
### Content Policy & Brand Restrictions
|
| 276 |
+
|
| 277 |
+
Google's AI models have built-in safety guardrails that may cause timeouts or rejections for certain content types:
|
| 278 |
+
|
| 279 |
+
#### 🚫 **Highly Restricted Content** (Likely to cause stalls/timeouts):
|
| 280 |
+
- **Political Figures**: Named world leaders, politicians (e.g., "Putin", "Zelensky", "Biden")
|
| 281 |
+
- **Political Buildings**: Government buildings like "10 Downing Street", "White House"
|
| 282 |
+
- **Geopolitical Content**: War, conflict, or sensitive international relations
|
| 283 |
+
- **Financial Institution Brands**: Major banks like "HSBC", "Bank of America", "JPMorgan"
|
| 284 |
+
|
| 285 |
+
#### ⚠️ **Moderately Restricted Content** (May cause delays):
|
| 286 |
+
- **Regulated Industries**: Healthcare, pharmaceutical, financial services
|
| 287 |
+
- **Some Corporate Brands**: Varies by sector and brand sensitivity
|
| 288 |
+
|
| 289 |
+
#### ✅ **Generally Permitted Content**:
|
| 290 |
+
- **Technology Brands**: "Cognizant", "Microsoft", "IBM", "Accenture"
|
| 291 |
+
- **Generic Business**: "Professional office", "corporate environment"
|
| 292 |
+
- **Non-branded Content**: Generic descriptions without specific brand names
|
| 293 |
+
|
| 294 |
+
#### 🔧 **Workarounds for Restricted Content**:
|
| 295 |
+
|
| 296 |
+
**Instead of**: `"Professional boardroom with HSBC signage"`
|
| 297 |
+
**Use**: `"Professional boardroom with international banking corporation signage in red and white colors"`
|
| 298 |
+
|
| 299 |
+
**Instead of**: `"Meeting with political leaders"`
|
| 300 |
+
**Use**: `"Meeting with business executives in government-style building"`
|
| 301 |
+
|
| 302 |
+
**Strategy**: Move brand-specific requirements to **Review Guidelines** instead of the main prompt:
|
| 303 |
+
- **Main Prompt**: `"Professional corporate environment"`
|
| 304 |
+
- **Review Guidelines**: `"Ensure branding reflects HSBC corporate colors (red and white)"`
|
| 305 |
+
|
| 306 |
+
This approach keeps brand names out of the generation prompt, so the content filters are not triggered, while the reviewer still receives the brand guidance.
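
The strategy of splitting a request into a brand-free main prompt and separate review guidelines could be automated along these lines. This is a sketch under stated assumptions: `BRAND_RULES` and `split_prompt` are hypothetical names, and the single HSBC entry simply reuses the wording from the examples above.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mapping: brand name -> (generic descriptor, review guideline).
BRAND_RULES: Dict[str, Tuple[str, str]] = {
    "HSBC": (
        "international banking corporation",
        "Ensure branding reflects HSBC corporate colors (red and white)",
    ),
}


def split_prompt(
    prompt: str,
    rules: Dict[str, Tuple[str, str]] = BRAND_RULES,
) -> Tuple[str, List[str]]:
    """Return (main prompt without brand names, review guidelines)."""
    guidelines: List[str] = []
    for brand, (generic, guideline) in rules.items():
        # Match the brand as a whole word, ignoring case.
        pattern = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
        if pattern.search(prompt):
            prompt = pattern.sub(generic, prompt)
            guidelines.append(guideline)
    return prompt, guidelines


main_prompt, review_guidelines = split_prompt(
    "Professional boardroom with HSBC signage"
)
```

Prompts that mention no listed brand pass through unchanged with an empty guideline list, so the helper can sit in front of every request.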
|
| 307 |
|
| 308 |
### Debug Mode
|
| 309 |
|
| 310 |
Enable debug logging by setting `LOG_LEVEL=DEBUG` in your environment variables.
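
For reference, a common way to honor a `LOG_LEVEL` variable in Python looks like the sketch below. How `app.py` actually reads the variable is not shown in this README, so the function name `configure_logging`, the `INFO` fallback, and the log format are assumptions.

```python
import logging
import os


def configure_logging(default: str = "INFO") -> int:
    """Configure the root logger from the LOG_LEVEL environment variable."""
    name = os.environ.get("LOG_LEVEL", default).upper()
    level = getattr(logging, name, None)
    if not isinstance(level, int):
        level = logging.INFO  # fall back on unrecognised values
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers installed earlier (Python 3.8+)
    )
    return level
```

On Hugging Face Spaces the variable would be set in the Space settings rather than in a local shell.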
|
| 311 |
|
| 312 |
+
### Content Policy Testing
|
| 313 |
+
|
| 314 |
+
Use the included diagnostic scripts to test content restrictions:
|
| 315 |
+
- `debug_hsbc_prompt.py` - Test financial brand restrictions
|
| 316 |
+
- `test_cognizant_brand.py` - Test tech brand accessibility
|
| 317 |
+
- `test_brand_workaround.py` - Test workaround strategies
|
| 318 |
+
|
| 319 |
### Support
|
| 320 |
|
| 321 |
For issues and questions:
|
|
|
|
| 329 |
|
| 330 |
## Acknowledgments
|
| 331 |
|
| 332 |
+
- Google AI for Imagen4 and Gemini 2.5 Pro technologies
|
| 333 |
- Hugging Face for the deployment platform
|
| 334 |
- Gradio for the web interface framework
|
| 335 |
+
- The open-source community for various dependencies
|