Wan Model Video-to-Video API Documentation¶
Wan/Alibaba Cloud provides high-quality video-to-video generation models. This document describes the complete API interface specification for using Wan/Alibaba Cloud models for video-to-video generation. The Wan video-to-video model uses the character and voice from an input video, combined with a prompt, to generate a new video that maintains character consistency.
Overview¶
Supported Models¶
Currently supported models include:
| Model | Description |
|---|---|
| wan2.6-r2v | Wan 2.6 video-to-video generation model |
The Wan model video-to-video feature provides an asynchronous task processing mechanism:
- Submit Task: Send reference videos and a text prompt to create a video generation task
- Query Status: Query generation progress and status through task ID
- Get Results: Retrieve the generated video file after task completion
Task Status Flow¶
- queued: Task has been submitted and is waiting to be processed
- in_progress: Task is being processed
- completed: Task completed successfully, video has been generated
- failed: Task failed
Features¶
- Basic features: You can select the video duration (5 or 10 seconds), specify the video resolution (720P or 1080P), and choose whether to add a watermark
- Multi-shot narrative: You can generate videos with multiple shots while maintaining subject consistency across shot changes
API List¶
| Method | Path | Description |
|---|---|---|
| POST | /v1/video/generations | Submit video generation task |
| GET | /v1/video/generations/{task_id} | Query task status |
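These two endpoints combine into the submit-then-poll workflow described above. The following is a minimal sketch in Python (not an official SDK), assuming the requests library, an API key stored in an API_KEY environment variable, and the base URL used in the curl example below; the request body follows the format documented under Usage Examples and Request Parameters:
import os
import requests
BASE_URL = "https://computevault.unodetech.xyz"  # base URL used in the curl examples in this document
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['API_KEY']}",  # assumption: API key stored in an environment variable
}
# Request body format: see Usage Examples and Request Parameters below.
payload = {
    "prompt": "character1 drinks bubble tea while dancing spontaneously to the music.",
    "model": "wan2.6-r2v",
    "metadata": {
        "input": {"reference_video_urls": ["https://example.com/reference-video.mp4"]},
        "parameters": {"size": "1280*720", "duration": 5, "shot_type": "multi"},
    },
}
# 1. Submit the task.
resp = requests.post(f"{BASE_URL}/v1/video/generations", headers=HEADERS, json=payload)
resp.raise_for_status()
task = resp.json()
task_id = task["id"]   # keep this ID for status queries
print(task["status"])  # "queued" immediately after submission
# 2. Query the task status by ID (see the polling sketch under "2. Query Task Status" below).
result = requests.get(f"{BASE_URL}/v1/video/generations/{task_id}", headers=HEADERS).json()
print(result["data"]["status"])  # IN_PROGRESS, SUCCESS, or FAILURE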
Usage Examples¶
1. Single-Character Reference¶
Reference the character's appearance and voice from a video, set shot_type to multi, and generate a multi-shot video.
Request Body:
{
"prompt": "character1 drinks bubble tea while dancing spontaneously to the music.",
"model": "wan2.6-r2v",
"metadata": {
"input": {
"reference_video_urls": [
"https://example.com/reference-video.mp4"
]
},
"parameters": {
"size": "1280*720",
"duration": 5,
"shot_type": "multi"
}
}
}
2. Multi-Character Reference¶
Based on reference videos for multiple subjects (for example, two characters, or a character and a prop), define the relationship between them in the prompt, set shot_type to multi, and generate a multi-shot video. You can reference the same character multiple times in the prompt.
Request Body:
{
"prompt": "character1 and character2 talk to each other in an office.",
"model": "wan2.6-r2v",
"metadata": {
"input": {
"reference_video_urls": [
"https://example.com/character1-video.mp4",
"https://example.com/character2-video.mp4"
],
"negative_prompt": "white walls"
},
"parameters": {
"size": "1280*720",
"duration": 10,
"shot_type": "multi",
"watermark": true
}
}
}
Complete Request:
curl -X POST "https://computevault.unodetech.xyz/v1/video/generations" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer API_KEY" \
-d '{
"prompt": "character1 and character2 talk to each other in an office.",
"model": "wan2.6-r2v",
"metadata": {
"input": {
"reference_video_urls": [
"https://example.com/character1-video.mp4",
"https://example.com/character2-video.mp4"
],
"negative_prompt": "white walls"
},
"parameters": {
"size": "1280*720",
"duration": 10,
"shot_type": "multi",
"watermark": true
}
}
}'
Request Parameters:¶
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name, must be wan2.6-r2v (currently the only supported model) |
| prompt | string | Yes | Text prompt describing the video content to be generated. In multi-character scenarios, you can use identifiers like character1, character2 to reference different reference videos |
| metadata | object | No | Metadata object containing the input and parameters sub-objects, which carry the fields from the official Wan request format |
metadata.input Parameters:¶
| Parameter | Type | Required | Description |
|---|---|---|---|
| reference_video_urls | array[string] | Yes | Array of reference video URLs. A maximum of 3 videos is supported. Multiple video usage: if you use multiple videos, the order of the URLs in the array defines the character order; the first URL corresponds to character1, the second to character2, and so on. Video requirements: (1) each reference video must contain only one character, for example, character1 is a little girl and character2 is an alarm clock; (2) format: MP4 or MOV; (3) duration: 2 to 30 seconds; (4) file size: no more than 100 MB; (5) URLs must use the HTTP or HTTPS protocol |
| negative_prompt | string | No | Negative prompt text to exclude certain elements from the video |
metadata.parameters Parameters:¶
| Parameter | Type | Required | Description |
|---|---|---|---|
| size | string | No | Video resolution. Default value is "1920*1080" (1080P). 720P tier options: "1280*720" (16:9), "720*1280" (9:16), "960*960" (1:1), "1088*832" (4:3), "832*1088" (3:4). 1080P tier options: "1920*1080" (16:9, default), "1080*1920" (9:16), "1440*1440" (1:1), "1632*1248" (4:3), "1248*1632" (3:4) |
| duration | integer | No | Video duration in seconds. Options: 5, 10 |
| shot_type | string | No | Specifies the shot type of the generated video. Options: "single" (default, outputs a single-shot video) or "multi" (outputs a multi-shot video while maintaining subject consistency across shot changes) |
| watermark | boolean | No | Whether to add a watermark to the generated video |
| seed | integer | No | Random number seed. The value must be in the range of [0, 2147483647]. If you do not specify this parameter, the system automatically generates a random seed. To improve the reproducibility of the results, you can set a fixed seed value. Note that because model generation is probabilistic, using the same seed does not guarantee that the results are identical every time. Example: 12345 |
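For illustration, the sketch below uses a hypothetical helper (the function name and validation logic are not part of the API) that assembles a metadata.parameters object from the options in the table above:
def build_parameters(duration, size="1920*1080", shot_type="single", watermark=None, seed=None):
    """Assemble a metadata.parameters dict from the documented options."""
    allowed_sizes = {
        "1280*720", "720*1280", "960*960", "1088*832", "832*1088",        # 720P tier
        "1920*1080", "1080*1920", "1440*1440", "1632*1248", "1248*1632",  # 1080P tier
    }
    if size not in allowed_sizes:
        raise ValueError(f"unsupported size: {size}")
    if duration not in (5, 10):
        raise ValueError("duration must be 5 or 10 seconds")
    if seed is not None and not 0 <= seed <= 2147483647:
        raise ValueError("seed must be in [0, 2147483647]")
    params = {"size": size, "duration": duration, "shot_type": shot_type}
    if watermark is not None:
        params["watermark"] = watermark
    if seed is not None:
        params["seed"] = seed  # a fixed seed improves, but does not guarantee, reproducibility
    return params
# Example: 1080P, 10-second, single-shot video with a fixed seed and no watermark.
parameters = build_parameters(duration=10, seed=12345, watermark=False)
The resulting dict goes under metadata.parameters in the request bodies shown in the usage examples above.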
1. Submit Video Generation Task¶
Endpoint:¶
POST /v1/video/generations
Request Headers:¶
| Parameter | Type | Required | Description |
|---|---|---|---|
| Content-Type | string | Yes | application/json |
| Authorization | string | Yes | Bearer API_KEY |
Response Example:¶
{
"id": "...",
"object": "video",
"model": "wan2.6-r2v",
"status": "queued",
"progress": 0,
"created_at": 1766086029
}
Response Field Descriptions:¶
| Field | Type | Description |
|---|---|---|
| id | string | Task ID for subsequent task status queries |
| object | string | Object type, fixed as "video" |
| model | string | Model used to generate the video |
| status | string | Task status, initially "queued" |
| progress | integer | Task progress, 0-100 |
| created_at | integer | Task creation timestamp |
2. Query Task Status¶
Complete Request¶
curl -X GET "https://computevault.unodetech.xyz/v1/video/generations/TASK_ID" \
-H "Authorization: Bearer API_KEY"
Endpoint:¶
GET /v1/video/generations/{task_id}
Request Headers:¶
| Parameter | Type | Required | Description |
|---|---|---|---|
| Authorization | string | Yes | Bearer API_KEY |
Path Parameters:¶
| Parameter | Type | Required | Description |
|---|---|---|---|
| task_id | string | Yes | Task ID |
Response Example (Processing):¶
{
"code": "success",
"message": "",
"data": {
"task_id": "...",
"action": "textGenerate",
"status": "IN_PROGRESS",
"fail_reason": "",
"submit_time": 1766086029,
"start_time": 1766086038,
"finish_time": 0,
"progress": "30%",
"data": {
"output": {
"scheduled_time": "2025-12-19 03:27:09.887",
"submit_time": "2025-12-19 03:27:09.859",
"task_id": "...",
"task_status": "RUNNING"
},
"request_id": "..."
}
}
}
Response Example (Success):¶
{
"code": "success",
"message": "",
"data": {
"task_id": "...",
"action": "textGenerate",
"status": "SUCCESS",
"fail_reason": "<OUTPUT_URL>",
"submit_time": 1766086029,
"start_time": 1766086038,
"finish_time": 1766086419,
"progress": "100%",
"data": {
"output": {
"end_time": "2025-12-19 03:33:31.045",
"orig_prompt": "character1 and character2 talk to each other in an office.",
"scheduled_time": "2025-12-19 03:27:09.887",
"submit_time": "2025-12-19 03:27:09.859",
"task_id": "...",
"task_status": "SUCCEEDED",
"video_url": "<OUTPUT_URL>"
},
"request_id": "...",
"usage": {
"SR": 720,
"duration": 15,
"input_video_duration": 5,
"output_video_duration": 10,
"size": "1280*720",
"video_count": 1,
"video_ratio": "1280*720"
}
}
}
}
You can retrieve the video URL from the data.data.output.video_url field.
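A minimal polling sketch in Python (assuming the requests library, an API key stored in an API_KEY environment variable, and a task ID returned by the submit endpoint) that waits for completion and downloads the result while the 24-hour link is still valid:
import os
import time
import requests
BASE_URL = "https://computevault.unodetech.xyz"
HEADERS = {"Authorization": f"Bearer {os.environ['API_KEY']}"}  # assumption: API key stored in an environment variable
task_id = "TASK_ID"  # ID returned by the submit endpoint
while True:
    resp = requests.get(f"{BASE_URL}/v1/video/generations/{task_id}", headers=HEADERS)
    resp.raise_for_status()
    data = resp.json()["data"]
    if data["status"] == "SUCCESS":
        video_url = data["data"]["output"]["video_url"]  # link is valid for 24 hours
        break
    if data["status"] == "FAILURE":
        raise RuntimeError(f"generation failed: {data['fail_reason']}")
    print(f"progress: {data['progress']}")  # e.g. "30%" while IN_PROGRESS
    time.sleep(10)  # poll again after a short delay
# Download the generated video before the link expires.
video = requests.get(video_url)
video.raise_for_status()
with open("output.mp4", "wb") as f:
    f.write(video.content)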
Response Example (Failed):¶
{
"code": "success",
"message": "",
"data": {
"task_id": "...",
"action": "textGenerate",
"status": "FAILURE",
"fail_reason": "task failed, code: InvalidParameter , message: The size is not match xxxxxx",
"submit_time": 1766086029,
"start_time": 1766086038,
"finish_time": 1766086419,
"progress": "100%",
"data": {
"output": {
"code": "InvalidParameter",
"end_time": "2025-12-19 03:33:31.045",
"message": "The size is not match xxxxxx",
"scheduled_time": "2025-12-19 03:27:09.887",
"submit_time": "2025-12-19 03:27:09.859",
"task_id": "...",
"task_status": "FAILED"
},
"request_id": "..."
}
}
}
Response Field Descriptions:¶
| Field | Type | Description |
|---|---|---|
| code | string | Response status code, "success" indicates success |
| message | string | Response message |
| data | object | Task data object |
| data.task_id | string | Task ID |
| data.status | string | Task status: IN_PROGRESS, SUCCESS, FAILURE |
| data.progress | string | Task progress percentage |
| data.data.output.video_url | string | Video access URL (when task succeeds). The link is valid for 24 hours |
| data.data.output.task_status | string | Task status: RUNNING, SUCCEEDED, FAILED |
| data.data.output.orig_prompt | string | Original input prompt |
| data.data.usage | object | Usage statistics (when task succeeds) |
| data.data.usage.input_video_duration | integer | Total duration of input reference videos in seconds |
| data.data.usage.output_video_duration | integer | Duration of output video in seconds, same as the value of parameters.duration |
| data.data.usage.duration | float | Total video duration in seconds, used for billing. Formula: duration = input_video_duration + output_video_duration |
| data.data.usage.SR | integer | Resolution of generated video, e.g., 720 |
| data.data.usage.video_ratio | string | Resolution of generated video, format "width*height", e.g., "1280*720" |
| data.data.usage.video_count | integer | Number of videos generated, fixed at 1 |
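For example, in the success response shown above, usage.input_video_duration is 5 and usage.output_video_duration is 10, so usage.duration is 5 + 10 = 15 billable seconds.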
Important Notes¶
- Data Retention: The task_id and video URL are retained for 24 hours. After this period, you can no longer query or download them.
- Content Moderation: Both the input prompt and the output video undergo content moderation. Requests that contain prohibited content return an "IPInfringementSuspect" or "DataInspectionFailed" error.
- Network Access Configuration: Video links are stored in Object Storage Service (OSS). If your business system cannot access external OSS links because of security policies, add the relevant OSS domain names to your network access whitelist.
- Billing Description:
  - You are billed per second based on the combined duration of the input video and the output video
  - You are charged only when the API call returns a task_status of SUCCEEDED and a video is successfully generated
  - Failed model calls or processing errors do not incur any fees or consume the free quota
  - Billable duration for input video: the sum of the truncated durations of each reference video. The total billable duration for the input cannot exceed 5 seconds
  - Billable duration for output video: the duration in seconds of the video successfully generated by the model
- Reference Video Count: Supports 1-3 reference videos. Use 1 video for single-character scenarios and multiple videos for multi-character scenarios.