Veo Model Image-to-Video API Documentation
Veo is a high-quality image-to-video generation model developed by Google. This document specifies the API for image-to-video generation with the Google Veo model. All video generation calls use the same /v1/video/generations endpoint; the parameters differ by use case. Image data is provided as a base64-encoded string.
Overview
The Veo model image-to-video feature provides an asynchronous task processing mechanism:
- Submit Task: Send an image and text prompt to create a video generation task
- Query Status: Query generation progress and status using the task ID
- Get Results: Retrieve the generated video file after task completion
Task Status Flow
- queued: Task has been submitted and is waiting to be processed
- in_progress: Task is being processed
- completed: Task completed successfully, video has been generated
- failed: Task failed
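The status flow above suggests a simple polling loop. The following is a minimal sketch, not a reference client: `wait_for_task` and its `fetch_status` callback are hypothetical names, and the polling interval and retry limit are arbitrary. Because this document uses both "completed" and "succeeded" for a successful task, the sketch accepts either, which is an assumption.

```python
import time
from typing import Callable

# Assumption: either spelling of the success status may be returned.
SUCCESS_STATUSES = {"completed", "succeeded"}
FAILURE_STATUSES = {"failed"}


def wait_for_task(fetch_status: Callable[[], str],
                  interval_s: float = 5.0,
                  max_polls: int = 120) -> str:
    """Poll until the task reaches a terminal status and return that status.

    fetch_status is a caller-supplied (hypothetical) function that returns
    the current status string of one task.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in SUCCESS_STATUSES or status in FAILURE_STATUSES:
            return status
        # Any other value (queued, in_progress, ...) means "keep waiting".
        time.sleep(interval_s)
    raise TimeoutError("task did not reach a terminal status in time")
```

Passing the status lookup in as a callback keeps the loop independent of the HTTP client in use.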
API List
| Method | Path | Description |
|---|---|---|
| POST | /v1/video/generations | Submit video generation task (standard format) |
| GET | /v1/video/generations/{task_id} | Query task status (standard format) |
| POST | /v1/videos | Submit video generation task |
| GET | /v1/videos/{task_id} | Query task status |
| GET | /v1/videos/{task_id}/content | Get video content (streaming download) |
Usage Examples
1. Basic Image-to-Video
The simplest form of image-to-video generation uses a single image as the first frame.
Request Body:
```json
{
  "model": "veo-3.0-fast-generate-001",
  "prompt": "A cat playing piano in a beautiful garden",
  "image": "<BASE64_ENCODED_IMAGE_DATA>",
  "metadata": {}
}
```
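A request body of this shape can be assembled from raw image bytes. This is a minimal sketch using only the Python standard library; the function name `build_basic_request` is hypothetical, and the default model is simply the one used in the example above.

```python
import base64
import json


def build_basic_request(image_bytes: bytes, prompt: str,
                        model: str = "veo-3.0-fast-generate-001") -> str:
    """Return the JSON request body for a first-frame image-to-video task."""
    body = {
        "model": model,
        "prompt": prompt,
        # The API expects the image as a base64-encoded string.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "metadata": {},
    }
    return json.dumps(body)
```

In practice `image_bytes` would typically come from reading the image file in binary mode.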
2. First and Last Frames
The image in the image field specifies the first frame of the video. The image in metadata.lastFrame specifies the last frame. This allows you to control both the starting and ending frames of the generated video.
Note: This feature is only supported by Veo 3.1 models.
Request Body:
```json
{
  "model": "veo-3.1-generate-preview",
  "prompt": "A cat playing piano in a beautiful garden",
  "image": "<BASE64_ENCODED_IMAGE_DATA>",
  "metadata": {
    "lastFrame": "<BASE64_ENCODED_IMAGE_DATA>"
  }
}
```
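Adding a last frame to an existing request body can be sketched as follows. The helper name `with_last_frame` is hypothetical. Since the note above limits this feature to Veo 3.1 models, the sketch rejects other model names; detecting a Veo 3.1 model by the substring "veo-3.1" is an assumption about naming, not a documented rule.

```python
import base64


def with_last_frame(body: dict, last_frame_bytes: bytes) -> dict:
    """Return a copy of body with metadata.lastFrame set."""
    # Assumption: Veo 3.1 model names contain the substring "veo-3.1".
    if "veo-3.1" not in body.get("model", ""):
        raise ValueError("lastFrame is documented as Veo 3.1 only")
    # Copy metadata so the caller's dictionary is left unchanged.
    metadata = dict(body.get("metadata") or {})
    metadata["lastFrame"] = base64.b64encode(last_frame_bytes).decode("ascii")
    return {**body, "metadata": metadata}
```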
3. Reference Images
Reference images are passed as an array in metadata.referenceImages with at most 3 elements. Each element is an object with two fields: image (base64-encoded image data) and referenceType (a string, either "asset" or "style").
Note: This feature is only supported by veo-3.1-generate-preview.
Request Body:
```json
{
  "model": "veo-3.1-generate-preview",
  "prompt": "A cat playing piano in a beautiful garden",
  "image": "<BASE64_ENCODED_IMAGE_DATA>",
  "metadata": {
    "referenceImages": [
      {
        "image": "<BASE64_ENCODED_IMAGE_DATA>",
        "referenceType": "asset"
      },
      {
        "image": "<BASE64_ENCODED_IMAGE_DATA>",
        "referenceType": "style"
      }
    ]
  }
}
```
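The constraints on reference images (at most 3 elements, referenceType of "asset" or "style") can be checked on the client before submitting. This is a sketch under those stated constraints only; `build_reference_images` is a hypothetical helper, and the server may apply further rules that are not covered here.

```python
import base64
from typing import Iterable, List, Tuple

ALLOWED_REFERENCE_TYPES = {"asset", "style"}
MAX_REFERENCE_IMAGES = 3


def build_reference_images(items: Iterable[Tuple[bytes, str]]) -> List[dict]:
    """Build the metadata.referenceImages array from (bytes, type) pairs."""
    refs = []
    for image_bytes, reference_type in items:
        if reference_type not in ALLOWED_REFERENCE_TYPES:
            raise ValueError(f"unsupported referenceType: {reference_type!r}")
        refs.append({
            "image": base64.b64encode(image_bytes).decode("ascii"),
            "referenceType": reference_type,
        })
    if len(refs) > MAX_REFERENCE_IMAGES:
        raise ValueError("at most 3 reference images are allowed")
    return refs
```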
Request Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name, e.g., veo-3.0-fast-generate-001 |
| prompt | string | Yes | Text prompt describing the video content to be generated |
| image | string | Yes | Base64-encoded image data for the first frame |
| metadata | object | No | Extended parameters object |
metadata Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| aspectRatio | string | No | Video aspect ratio, options: "16:9", "9:16" |
| durationSeconds | number | No | Video duration (seconds), options: 4, 6, 8 |
| negativePrompt | string | No | Negative prompt describing content not desired in the video |
| personGeneration | string | No | Person generation strategy, options: "allow_all" (text-to-video), "allow_adult" (image-to-video) |
| resolution | string | No | Video resolution, e.g., "1080p", "720p" |
| sampleCount | number | No | Number of videos to generate, default 1 |
| storageUri | string | No | Google Cloud Storage URI for storing generated videos |
| lastFrame | string | No | Base64-encoded image data for the last frame (Veo 3.1 models only) |
| referenceImages | array | No | Array of reference images, up to 3 elements (veo-3.1-generate-preview only) |
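Some of the option lists in the table above lend themselves to a quick client-side check. The sketch below covers only aspectRatio, durationSeconds, and the referenceImages length; `validate_metadata` is a hypothetical helper, and treating the listed options as exhaustive is an assumption.

```python
from typing import List

ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}
ALLOWED_DURATIONS = {4, 6, 8}
MAX_REFERENCE_IMAGES = 3


def validate_metadata(metadata: dict) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if "aspectRatio" in metadata and metadata["aspectRatio"] not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspectRatio must be \"16:9\" or \"9:16\"")
    if "durationSeconds" in metadata and metadata["durationSeconds"] not in ALLOWED_DURATIONS:
        problems.append("durationSeconds must be 4, 6, or 8")
    if len(metadata.get("referenceImages") or []) > MAX_REFERENCE_IMAGES:
        problems.append("referenceImages may contain at most 3 elements")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.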
referenceImages Array Elements:
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | Base64-encoded image data |
| referenceType | string | Yes | Reference type, options: "asset" or "style" |
1. Submit Video Generation Task
Complete Request:
```bash
curl -X POST "https://computevault.unodetech.xyz/v1/video/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer API_KEY" \
  -d @veoImageToVideoTest.json
```
Endpoint:
POST /v1/video/generations
Request Headers:
| Parameter | Type | Required | Description |
|---|---|---|---|
| Content-Type | string | Yes | application/json |
| Authorization | string | Yes | Bearer API_KEY |
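The same submission can be prepared from Python with the standard library. This sketch only builds the request object, using the host from the curl example and the two headers listed above; `build_submit_request` is a hypothetical name.

```python
import json
import urllib.request

# Host taken from the curl example in this document.
BASE_URL = "https://computevault.unodetech.xyz"


def build_submit_request(body: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits a task."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/video/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The request would be sent with `urllib.request.urlopen(request)`. Parsing the reply is left out here because this document does not show the submit response body.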
Response Example:
Response Field Descriptions:
| Field | Type | Description |
|---|---|---|
| task_id | string | Task ID for subsequent task status queries |
2. Query Task Status
Complete Request:
```bash
curl -X GET "https://computevault.unodetech.xyz/v1/video/generations/TASK_ID" \
  -H "Authorization: Bearer API_KEY"
```
Endpoint:
GET /v1/video/generations/{task_id}
Request Headers:
| Parameter | Type | Required | Description |
|---|---|---|---|
| Authorization | string | Yes | Bearer API_KEY |
Path Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| task_id | string | Yes | Task ID |
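A status query can be prepared the same way. In this sketch `build_status_request` is a hypothetical name, the host is the one from the curl example, and percent-encoding the task ID is a defensive assumption, since this document does not say which characters a task ID may contain.

```python
import urllib.parse
import urllib.request

# Host taken from the curl example in this document.
BASE_URL = "https://computevault.unodetech.xyz"


def build_status_request(task_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that queries one task."""
    # Percent-encode the ID so it cannot alter the path structure.
    quoted_id = urllib.parse.quote(task_id, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/video/generations/{quoted_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```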
Response Example (Processing):
```json
{
  "code": "success",
  "message": "",
  "data": {
    "bytes_base64_encoded": "",
    "error": null,
    "format": "mp4",
    "metadata": null,
    "status": "processing",
    "task_id": "TASK_ID",
    "url": ""
  }
}
```
Response Example (Success):
```json
{
  "code": "success",
  "message": "",
  "data": {
    "bytes_base64_encoded": "",
    "error": null,
    "format": "mp4",
    "metadata": null,
    "status": "succeeded",
    "task_id": "TASK_ID",
    "url": "https://computevault.unodetech.xyz/v1/videos/TASK_ID/content"
  }
}
```
Note: Depending on the AI service provider, the video will be returned either as base64-encoded data in the bytes_base64_encoded field (Vertex) or via a content URL in the url field (Gemini).
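Since the video may arrive in either field, a client can check both. The following is a minimal sketch working on the parsed JSON response; `extract_video` is a hypothetical helper, and preferring the base64 field when both are populated is an arbitrary choice.

```python
import base64
from typing import Optional, Tuple


def extract_video(response: dict) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (video_bytes, video_url) for a succeeded task.

    At most one element is set; both are None if the task has not
    succeeded or neither field is populated.
    """
    data = response.get("data") or {}
    if data.get("status") != "succeeded":
        return None, None
    encoded = data.get("bytes_base64_encoded") or ""
    if encoded:
        # Inline delivery: decode the base64 payload into raw video bytes.
        return base64.b64decode(encoded), None
    # URL delivery: the caller downloads the content separately.
    return None, (data.get("url") or None)
```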
Response Example (Failed):
```json
{
  "code": "success",
  "message": "",
  "data": {
    "bytes_base64_encoded": "",
    "error": null,
    "format": "mp4",
    "metadata": null,
    "status": "failed",
    "task_id": "TASK_ID",
    "url": "Reference to video does not support this mix of reference images."
  }
}
```
When a task fails, the url field contains the error message instead of a video URL.
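Because the error text of a failed task appears in the url field, a client should check the status before treating url as a link. This is a sketch only: `failure_message` is a hypothetical helper, and the fallback order (data.error, then data.url, then the top-level message) is an assumption.

```python
from typing import Optional


def failure_message(response: dict) -> Optional[str]:
    """Return the error text of a failed task, or None if it did not fail."""
    data = response.get("data") or {}
    if data.get("status") != "failed":
        return None
    error = data.get("error")
    if error:
        return str(error)
    # In the failed example, the message is carried in the url field.
    return data.get("url") or response.get("message") or "unknown error"
```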
Response Field Descriptions:
| Field | Type | Description |
|---|---|---|
| code | string | Response status code, "success" indicates success |
| data | object | Task data object |
| data.task_id | string | Task ID |
| data.status | string | Task status: queued, in_progress, succeeded, failed |
| data.format | string | Video format, e.g., "mp4" |
| data.url | string | Video access URL (when task succeeds), or error message (when task fails) |
| data.bytes_base64_encoded | string | Base64-encoded video data (when available) |
| data.error | object | Error information (when task fails) |
| message | string | Error message |