Create stunning images from text descriptions. Edit and manipulate visual content using natural language prompts.
Convert natural language descriptions into photorealistic or artistic images using Gemini's advanced visual generation capabilities.
    curl -X POST https://api.gemini.com/v1/image \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"prompt":"A cyberpunk cityscape at night with neon lights and holographic billboards"}'
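The same request could be assembled in Python. This is a minimal sketch using only the standard library: the URL, headers, and `prompt` field are copied from the curl example above, while the function name `build_generation_request` is illustrative and not part of any official SDK. The response format is not described on this page, so the sketch stops at building the request.

```python
import json
import urllib.request

# Endpoint copied from the curl example; not independently verified.
API_URL = "https://api.gemini.com/v1/image"


def build_generation_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a text-to-image request.

    Hypothetical helper: it mirrors the curl flags one-to-one.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; a real API key is needed for that step.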
Upload your base image and use natural language instructions to edit elements within the image.
Supported formats: JPEG, PNG, WebP (max 10MB)
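A client might check a file against these limits before uploading. The sketch below is an assumption-laden illustration: it reads "10MB" as 10 * 1024 * 1024 bytes (the page does not say whether MB or MiB is meant), it judges the format by file extension only, and `check_upload` is a hypothetical helper name.

```python
import os

# Assumption: "10MB" is interpreted as 10 MiB.
MAX_BYTES = 10 * 1024 * 1024
# Extensions corresponding to the listed formats (JPEG, PNG, WebP).
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def check_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

For a file on disk, `os.path.getsize(path)` supplies the second argument. An extension check does not prove the file contents match the format, so the server may still reject a mislabeled file.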
Then describe the change you want to make to a specific element of the image, for example:
    {
      "action": "Edit Image",
      "description": "Add a glowing hologram in the center of the image"
    }
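An edit instruction of this shape could be generated programmatically. The sketch below assumes the `action` and `description` fields shown above are the whole payload; how the instruction is attached to the uploaded image is not specified on this page, and `build_edit_instruction` is a hypothetical name.

```python
import json


def build_edit_instruction(description: str) -> str:
    """Serialize an edit instruction with the two fields shown in the example."""
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")
    # "Edit Image" is the only action value that appears in the source example.
    return json.dumps({"action": "Edit Image", "description": description})
```

Specific, localized descriptions ("in the center of the image") mirror the example and tend to be easier to verify visually than broad ones.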
Apply different artistic styles to your images with prompts like 'watercolor painting style' or 'modern digital art'.
Ask detailed questions about images: "What is the main subject in this photo?" or "Describe the lighting composition".
Automatically enhance lighting, color balance, and composition using Gemini's visual understanding.
Ask questions about any image to get detailed analysis and metadata:
    curl -X POST https://api.gemini.com/v1/image \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: multipart/form-data" \
      -F "image_file=@catphoto.jpg" \
      -F "prompt=What kind of cat is this?"
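For readers working in Python, the multipart request can be sketched with the standard library alone. The field names `image_file` and `prompt` and the URL come from the curl command; the helper name `build_question_request` and the hand-rolled multipart encoding are illustrative assumptions, and the filename is assumed to contain no double quotes.

```python
import mimetypes
import urllib.request
import uuid

# Endpoint copied from the curl example; not independently verified.
API_URL = "https://api.gemini.com/v1/image"


def build_question_request(image_bytes: bytes, filename: str,
                           question: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data image question request."""
    boundary = uuid.uuid4().hex
    mime_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"

    # File part first, then the text part, in the same order as the curl -F flags.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image_file"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    prompt_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="prompt"\r\n\r\n'
        f"{question}\r\n"
    ).encode("utf-8")
    closing = f"--{boundary}--\r\n".encode("utf-8")

    body = file_header + image_bytes + b"\r\n" + prompt_part + closing
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # The boundary must match the one used in the body.
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The image bytes would typically come from `open("catphoto.jpg", "rb").read()`, and `urllib.request.urlopen` would send the request.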
Example Response:
"This appears to be a Maine Coon cat with distinctive tabby patterns"
Complete documentation for the Gemini image generation and editing API endpoints. Includes detailed parameter lists, request formats, and response specifications.
Python, Node.js, and other library integrations for visual content generation. Includes step-by-step coding examples for common use cases.