Content Generation API (Video / Image / Audio / Music)

Generate video, image, audio (TTS) and music with Onysoft AI Gateway. Use 40+ provider models such as Veo, Sora 2, Runway, Kling, Wan, Hailuo, Seedream, Flux, GPT Image, Imagen, Nano Banana, Ideogram, Grok Imagine, ElevenLabs and Suno through a single asynchronous backend.

5 endpoints, 1 shared backend

The generation flow exposes 5 endpoint names, and all of them route to the same backend. The choice between them is purely semantic and exists only to make your code more readable. The returned task_id is queried the same way for all of them.

Endpoint Names (Aliases)

The 5 endpoints below are fully equivalent. Whichever you send, the same backend runs and the same task_id format is returned. For semantic clarity, prefer the one that fits your model type.

Endpoint	Recommended Use	Example Models
`POST /v1/video/generate`	Video generation	veo3, sora-2, runway, kling, wan, hailuo
`POST /v1/image/generate`	Image generation	seedream, flux, imagen, nano-banana, ideogram, grok-imagine, gpt-image
`POST /v1/audio/generate`	Audio generation (general)	elevenlabs/text-to-speech, sound-effect, audio-isolation, speech-to-text
`POST /v1/voice/generate`	Speech generation (TTS)	elevenlabs/text-to-speech-multilingual-v2, turbo-2-5, v3-text-to-dialogue
`POST /v1/music/generate`	Music generation	suno-v4, ai-music-api/generate, mashup, extend, sounds

Status Lookup Endpoints

You can look up a task's status through the alias you used for generation or through any other alias, because the task_id is global.

GET /v1/video/status/{task_id}
GET /v1/image/status/{task_id}
GET /v1/audio/status/{task_id}
GET /v1/voice/status/{task_id}
GET /v1/music/status/{task_id}
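Because the five aliases differ only in the path segment, the generation and status URLs can be built from one small helper. The following is a minimal sketch, not part of the official client: the function names `generate_url` and `status_url` are hypothetical, and the base URL is the one used in the cURL examples on this page.

```python
BASE_URL = "https://api.onysoft.com/v1"

# The five alias names listed above; every one reaches the same backend.
ALIASES = ("video", "image", "audio", "voice", "music")


def _check_alias(alias: str) -> None:
    """Reject path segments that are not one of the documented aliases."""
    if alias not in ALIASES:
        raise ValueError(f"unknown alias {alias!r}, expected one of {ALIASES}")


def generate_url(alias: str) -> str:
    """Build the POST URL that starts a generation task."""
    _check_alias(alias)
    return f"{BASE_URL}/{alias}/generate"


def status_url(task_id: str, alias: str = "video") -> str:
    """Build the GET URL for a task; any alias works because task_id is global."""
    _check_alias(alias)
    return f"{BASE_URL}/{alias}/status/{task_id}"
```

For example, `generate_url("music")` gives the music alias, and `status_url(task_id, "image")` can be used to query a task that was started through any of the five endpoints.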

Asynchronous Design

All of these endpoints work asynchronously. When you send a request you receive a task_id (HTTP 202). You use this ID to query the generation status and check whether your content is ready. Typical duration: image 10-30s, video 30-180s, TTS 5-15s, music 30-90s.
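The typical durations above suggest a polling loop with a time budget per content type. The sketch below is one possible shape for it, under stated assumptions: `poll_until_done`, the `safety_factor` multiplier and the injected `fetch_status` callable are all hypothetical names chosen for illustration, and the budget is simply the upper typical duration multiplied by that factor. The terminal status values `completed` and `failed` are the ones documented in the Status Values table.

```python
import time
from typing import Callable, Dict

# Upper bounds of the typical durations quoted above, in seconds.
TYPICAL_MAX_SECONDS = {"image": 30, "video": 180, "tts": 15, "music": 90}


def poll_until_done(
    fetch_status: Callable[[], Dict],
    kind: str = "video",
    interval: float = 5.0,
    safety_factor: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_status() until the task completes or fails, or the budget runs out.

    fetch_status must return the "data" object of a status response.
    """
    budget = TYPICAL_MAX_SECONDS[kind] * safety_factor
    waited = 0.0
    while True:
        data = fetch_status()
        if data["status"] in ("completed", "failed"):
            return data
        if waited >= budget:
            raise TimeoutError(f"{kind} task still {data['status']} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Passing the fetch function in, rather than calling the HTTP client directly, keeps the loop easy to test and lets the same code serve all five aliases.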

Text/Chat Exception

For text/chat models, use /v1/chat/completions (synchronous, streaming-capable — see Chat Completions). Special case: Google Gemini image models (gemini-*-image) are called through the chat completions endpoint (multimodal response).

Generate Video

POST /v1/video/generate

Starts a new video, image or music generation task.

Parameters

model string required

The video model. Example: veo3_fast, kling-2.5, runway_gen4_turbo

prompt string required

Video description (English recommended). Detailed and descriptive prompts produce better results.

aspect_ratio string optional

Aspect ratio: "16:9", "9:16", "1:1". Default: "16:9"

duration integer optional

Video length (seconds). Default: 5. Depending on the model, 5, 6, 8 or 10 seconds are supported.

image_url string required for i2v/i2i

Source image. Required for image-to-video and image-to-image models. Can be a public HTTP URL or a base64 data URI.

image string optional

Alternative to image_url. A base64-encoded image. The system automatically saves it to a temp file and produces a public URL.

resolution string optional

Resolution. Supports different values per model: "720p", "1080p" (video), "1K", "2K", "4K" (image). The backend selects the correct value automatically.

quality string optional

Quality level. "high", "standard", "low". Used only with the gpt-image and seedream models.

negative_prompt string optional

Unwanted elements. Specify which object/style/color you do not want generated.
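Checking these parameters on the client before sending can save a round trip. The following sketch assumes the generic values listed above; `build_generate_payload` is a hypothetical helper name, and the duration set is the union of the values mentioned in this section, so a given model may accept only some of them. Model families with their own conventions (for example Sora 2, which uses "landscape" and "portrait") would need a separate builder.

```python
from typing import Any, Dict, Optional

ASPECT_RATIOS = ("16:9", "9:16", "1:1")
DURATIONS = (5, 6, 8, 10)  # union across models; each model supports a subset


def build_generate_payload(
    model: str,
    prompt: str,
    aspect_ratio: str = "16:9",
    duration: int = 5,
    image_url: Optional[str] = None,
    negative_prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the generic parameters and return a JSON-ready request body."""
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {DURATIONS}")
    payload: Dict[str, Any] = {
        "model": model,
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    # Optional fields are only sent when the caller provides them.
    if image_url is not None:
        payload["image_url"] = image_url
    if negative_prompt is not None:
        payload["negative_prompt"] = negative_prompt
    return payload
```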

Example Requests

Video Generation (Text-to-Video)

cURL

curl -X POST https://api.onysoft.com/v1/video/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo3_fast",
    "prompt": "A golden retriever running through a field of sunflowers at sunset",
    "aspect_ratio": "16:9",
    "duration": 5
  }'
            

Image Generation (Text-to-Image)

cURL

curl -X POST https://api.onysoft.com/v1/image/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/imagen4",
    "prompt": "A beautiful sunset over mountains, photorealistic, 4K",
    "aspect_ratio": "16:9"
  }'
            

Image Editing (Image-to-Image)

cURL

curl -X POST https://api.onysoft.com/v1/image/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedream/4.5-edit",
    "prompt": "Change background to a futuristic city",
    "image_url": "https://example.com/input.jpg",
    "aspect_ratio": "1:1"
  }'
            

Music Generation

cURL

curl -X POST https://api.onysoft.com/v1/music/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "suno-v4",
    "prompt": "Upbeat electronic dance music with catchy female vocals about summer"
  }'
            

Example Response

JSON Response

{
  "success": true,
  "data": {
    "task_id": "vid_abc123...",
    "status": "pending",
    "model": "veo3_fast",
    "estimated_cost": {
      "amount": 0.60,
      "currency": "USD"
    }
  }
}
            

Status Lookup

GET /v1/video/status/{task_id}

Queries the video generation status.

Status Values

Status	Description
`pending`	Request received, queued
`processing`	Video generation in progress
`completed`	Video ready, URL available
`failed`	Generation failed
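The four status values split into two in-progress states and two terminal states, which a client can turn into a single decision function. This is a hedged sketch: `output_url` and `GenerationFailed` are hypothetical names, the `video_url` field is taken from the completed-response example on this page, and the `error` field is the one read by the Python polling example; other content types may use different field names.

```python
from typing import Dict, Optional

IN_PROGRESS = ("pending", "processing")


class GenerationFailed(Exception):
    """Raised when the status endpoint reports a failed task."""


def output_url(body: Dict) -> Optional[str]:
    """Return the result URL once completed, or None while the task is still running."""
    data = body["data"]
    status = data["status"]
    if status in IN_PROGRESS:
        return None
    if status == "failed":
        raise GenerationFailed(data.get("error", "unknown error"))
    if status == "completed":
        return data["video_url"]
    # Any value outside the documented four is treated as a client-side error.
    raise ValueError(f"unexpected status {status!r}")
```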

Example Request

cURL

curl https://api.onysoft.com/v1/video/status/vid_abc123 \
  -H "Authorization: Bearer sk-ony-YOUR-KEY"

Example Response (Completed)

JSON Response

{
  "success": true,
  "data": {
    "task_id": "vid_abc123...",
    "status": "completed",
    "video_url": "https://...",
    "thumbnail_url": "https://...",
    "duration": 5,
    "model": "veo3_fast"
  }
}
            

Video List

GET /v1/video/list

Returns a list of generated videos.

Query Parameters

page integer optional

Page number. Default: 1

per_page integer optional

Records per page. Default: 20, Maximum: 100
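The two query parameters can be assembled with the standard library. In the sketch below, `list_url` is a hypothetical helper; clamping `per_page` to the documented maximum of 100 is a client-side choice made here for illustration, not documented server behaviour.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.onysoft.com/v1"
MAX_PER_PAGE = 100


def list_url(page: int = 1, per_page: int = 20) -> str:
    """Build the video list URL, clamping per_page to the range 1..100."""
    if page < 1:
        raise ValueError("page starts at 1")
    per_page = max(1, min(per_page, MAX_PER_PAGE))
    return f"{BASE_URL}/video/list?{urlencode({'page': page, 'per_page': per_page})}"
```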

Supported Models

The system provides access to 200+ KieAI models. The model list is dynamic and synced from the admin panel. For the current model list and prices, use the Models page or the GET /v1/models endpoint.

Video Models (Featured)

Model	Provider	Type	Description
`veo3`	Google	Video	Veo 3.1 - Highest quality (t2v/i2v)
`veo3_fast`	Google	Video	Veo 3.1 Fast - Fast generation
`sora-2-text-to-video`	OpenAI	Video	Sora 2 text-to-video
`sora-2-pro-text-to-video`	OpenAI	Video	Sora 2 Pro - high quality
`runway_gen4_turbo`	Runway	Video	Gen-4 Turbo
`runway_aleph`	Runway	Video	Aleph (video-to-video)
`kling/v2-5-turbo-text-to-video-pro`	Kling	Video	Kling 2.5 Turbo Pro
`kling-2.6/text-to-video`	Kling	Video	Kling 2.6
`wan/2-6-text-to-video`	Alibaba	Video	Wan 2.6
`hailuo/2-3-image-to-video-pro`	Hailuo	Video	Hailuo 2.3 Pro (i2v)
`grok-imagine/text-to-video`	xAI	Video	Grok Imagine Video

Image Models (Featured)

Model	Provider	Type	Description
`grok-imagine/text-to-image`	xAI	t2i	Grok Imagine (most compatible)
`google/imagen4`	Google	t2i	Imagen 4 Standard
`google/imagen4-fast`	Google	t2i	Imagen 4 Fast
`google/imagen4-ultra`	Google	t2i	Imagen 4 Ultra - Highest quality
`google/nano-banana-2`	Google	t2i	Nano Banana 2 (1K/2K/4K)
`ideogram/v3-text-to-image`	Ideogram	t2i	Ideogram V3 (text-to-image)
`ideogram/v3-edit`	Ideogram	i2i	Ideogram V3 Edit
`flux-2/pro-text-to-image`	Flux	t2i	Flux 2 Pro
`flux-2/flex-text-to-image`	Flux	t2i	Flux 2 Flex
`seedream/4.5-text-to-image`	ByteDance	t2i	Seedream 4.5 (text-to-image)
`seedream/4.5-edit`	ByteDance	i2i	Seedream 4.5 Edit (image-to-image)
`gpt-image/1.5-text-to-image`	OpenAI	t2i	GPT Image 1.5
`wan/2-7-image-pro`	Alibaba	t2i	Wan 2.7 Image Pro

Audio Models (TTS / Sound Effect / Speech)

Audio generation via the ElevenLabs provider. Recommended endpoint: POST /v1/audio/generate or POST /v1/voice/generate.

Model	Type	Description	Price (USD)
`elevenlabs/text-to-speech-multilingual-v2`	TTS	Multilingual TTS (29 languages including Turkish). Recommended.	$0.060
`elevenlabs/text-to-speech-turbo-2-5`	TTS	Low-latency TTS (primarily English). Currently returning an "internal error" on the provider side.	$0.030
`elevenlabs/v3-text-to-dialogue`	TTS Dialogue	Multi-character dialogue generation (V3 model)	$0.070
`elevenlabs/sound-effect-v2`	SFX	Sound effect generation (prompt → audio)	$0.001
`elevenlabs/audio-isolation`	Audio Filter	Vocal/instrument isolation	$0.001
`elevenlabs/speech-to-text`	STT	Speech to text (transcription)	$0.0175

TTS Parameters

Parameter	Type	Default	Description
`prompt`	string	—	The text to voice (required)
`voice`	string	`"Rachel"`	Voice identifier. A value from the list below (case-sensitive).
`stability`	float	0.5	Voice consistency (0.0 - 1.0). High = monotone, low = expressive.
`similarity_boost`	float	0.75	Voice similarity (0.0 - 1.0).
`style`	float	0.0	Style intensity (0.0 - 1.0). Higher = more characterful.

Supported Voice List (ElevenLabs)

In the ElevenLabs integration via KieAI, only the following 20 voices are available. The values are case-sensitive — the first letter must be capitalized. If you send a voice outside the list, KieAI returns the error "This voice is not within the range of allowed options".

Female	Male
`Rachel` (default), `Alice`, `Aria`, `Charlotte`, `Jessica`, `Laura`, `Lily`, `Matilda`, `Sarah`	`Brian`, `Bill`, `Callum`, `Charlie`, `Chris`, `Daniel`, `Eric`, `George`, `Liam`, `Roger`, `Will`
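Since an unlisted or wrongly capitalized voice is rejected, it can help to validate the voice and the three 0.0 to 1.0 settings before sending a TTS request. The sketch below copies the 20 voices from the table above; `build_tts_payload` is a hypothetical helper, and the defaults mirror the TTS Parameters table.

```python
from typing import Any, Dict

VOICES = frozenset({
    "Rachel", "Alice", "Aria", "Charlotte", "Jessica", "Laura", "Lily",
    "Matilda", "Sarah", "Brian", "Bill", "Callum", "Charlie", "Chris",
    "Daniel", "Eric", "George", "Liam", "Roger", "Will",
})


def build_tts_payload(
    text: str,
    voice: str = "Rachel",
    stability: float = 0.5,
    similarity_boost: float = 0.75,
    style: float = 0.0,
    model: str = "elevenlabs/text-to-speech-multilingual-v2",
) -> Dict[str, Any]:
    """Validate voice and settings, then return a JSON-ready TTS request body."""
    if voice not in VOICES:
        # Voice names are case-sensitive, so "rachel" is rejected here.
        raise ValueError(f"voice {voice!r} is not in the supported list")
    settings = {"stability": stability, "similarity_boost": similarity_boost, "style": style}
    for name, value in settings.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return {"model": model, "prompt": text, "voice": voice, **settings}
```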

For Turkish TTS

Use the elevenlabs/text-to-speech-multilingual-v2 model, which provides natural narration in 29 languages including Turkish. The turbo-2-5 model is primarily English and currently has a temporary issue on the provider side.

TTS Example (Turkish, multilingual-v2)

cURL

curl -X POST https://api.onysoft.com/v1/voice/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs/text-to-speech-multilingual-v2",
    "prompt": "Merhaba, ben Rachel. Onysoft AI üzerinden konuşuyorum.",
    "voice": "Rachel",
    "stability": 0.6,
    "similarity_boost": 0.8,
    "style": 0.1
  }'
            

Sound Effect Parameters

Parameter	Type	Default	Description
`prompt`	string	—	Sound effect description (e.g. "rain on metal roof")
`duration_seconds`	float	5.0	Generation length (seconds). Between 1.0 and 22.0.
`prompt_influence`	float	0.3	Fidelity to the prompt (0.0 - 1.0).
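The ranges in this table are easy to enforce before the request leaves the client. A minimal sketch, assuming the documented defaults and limits; `build_sfx_payload` is a hypothetical name.

```python
from typing import Any, Dict


def build_sfx_payload(
    prompt: str,
    duration_seconds: float = 5.0,
    prompt_influence: float = 0.3,
) -> Dict[str, Any]:
    """Validate the documented ranges and return a sound-effect request body."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 1.0 <= duration_seconds <= 22.0:
        raise ValueError("duration_seconds must be between 1.0 and 22.0")
    if not 0.0 <= prompt_influence <= 1.0:
        raise ValueError("prompt_influence must be between 0.0 and 1.0")
    return {
        "model": "elevenlabs/sound-effect-v2",
        "prompt": prompt,
        "duration_seconds": duration_seconds,
        "prompt_influence": prompt_influence,
    }
```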

Sound Effect Example

cURL

curl -X POST https://api.onysoft.com/v1/audio/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs/sound-effect-v2",
    "prompt": "thunder rolling in distant mountains, heavy rain",
    "duration_seconds": 8.0,
    "prompt_influence": 0.5
  }'
            

Music Models

Suno and KieAI ai-music-api models for song/music generation. Recommended endpoint: POST /v1/music/generate.

Model	Provider	Description
`suno-v4`	Suno	Suno V4 — Song generation (instrumental/vocal, multiple genres)
`ai-music-api/generate`	KieAI	Create a new song (Suno behind the scenes)
`ai-music-api/extend`	KieAI	Extend an existing song
`ai-music-api/mashup`	KieAI	Combine 2 songs
`ai-music-api/sounds`	KieAI	Sound effects
`ai-music-api/upload-and-cover-audio`	KieAI	Generate a cover from existing audio
`ai-music-api/separate-vocals`	KieAI	Vocal/instrument separation
`ai-music-api/create-music-video`	KieAI	Generate a visual video for music
`ai-music-api/add-instrumental`	KieAI	Add instrumentals to existing vocals
`ai-music-api/convert-to-wav-format`	KieAI	MP3 → WAV conversion
`suno-add-vocals`	KieAI	Add vocals to an instrumental
`suno-extend-music`	KieAI	Extend a Suno track
`suno-generate-lyrics`	KieAI	Lyrics generation

Current Prices

Prices are pulled dynamically from the KieAI API. They are shown with the markup rate applied to your project/customer. For current pricing: call GET /v1/models or see the models page.

Per-Model Required Parameters

Each provider requires different parameters. The table below shows the parameters required for each model family. These parameters are filled in automatically on the backend — you only need to send prompt and, optionally, aspect_ratio.

Provider	Model Family	Auto-Filled Parameters
Google	imagen4 / imagen4-fast / imagen4-ultra	prompt (only)
xAI	grok-imagine/*	aspect_ratio
Ideogram	ideogram/v3-text-to-image	prompt (only)
Ideogram	ideogram/v3-edit, v3-remix	prompt + image_url (i2i)
Ideogram	ideogram/character*	reference_image_urls (i2i)
Flux	flux-2/*	aspect_ratio, resolution (1K/2K)
ByteDance	seedream/*	quality, n, response_format, size, resolution (2K), aspect_ratio, background
OpenAI	gpt-image/*	quality, n, response_format, size, resolution, aspect_ratio, background
Alibaba	wan/2-7-image-*	aspect_ratio, resolution (2K/4K)
Alibaba	wan video models	duration, resolution (720p/1080p)
Google	nano-banana-edit	image_urls (i2i - image required)

i2i / i2v (image-to-image / image-to-video) Models

When sending a source image for image or video generation, follow this rule:

Single image: send image_url (string)
Multiple images (reference): send image_urls (array)
The backend automatically converts to the actual parameter name each provider expects (imageUrl, input_urls, first_frame_url, image_input, etc.)
As a value, use a public HTTP URL (https://...) or a base64 data URI (data:image/jpeg;base64,...) — if base64, the system automatically converts it to a public URL
Maximum total request size: 10 MB
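The rules above can be sketched as two small helpers: one that turns raw image bytes into a base64 data URI, and one that picks `image_url` or `image_urls` depending on how many sources there are. Both function names are hypothetical, and the size check assumes "10 MB" means 10 × 1024 × 1024 bytes and compares only the data URI, not the full request body, so it is a rough guard rather than an exact limit.

```python
import base64
from typing import Dict, List, Union

MAX_REQUEST_BYTES = 10 * 1024 * 1024  # assumption: 10 MB interpreted as 10 MiB


def to_data_uri(raw: bytes, mime: str = "image/jpeg") -> str:
    """Encode image bytes as a base64 data URI and reject oversized payloads."""
    uri = f"data:{mime};base64,{base64.b64encode(raw).decode('ascii')}"
    if len(uri) > MAX_REQUEST_BYTES:
        raise ValueError("encoded image alone exceeds the 10 MB request limit")
    return uri


def image_fields(sources: List[str]) -> Dict[str, Union[str, List[str]]]:
    """Return image_url for one source, image_urls for several, nothing for none."""
    if not sources:
        return {}
    if len(sources) == 1:
        return {"image_url": sources[0]}
    return {"image_urls": list(sources)}
```

Base64 inflates data by roughly a third, so a source file of about 7.5 MB already approaches the limit once encoded.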

Provider Parameter Mapping Matrix

The table below shows, for advanced use, the internal parameter names each provider expects. You do not need to know this detail in normal use — sending image_url or image_urls is enough; the backend converts automatically.

Model Family	Expected Parameter	Type	Note
`seedream/*-edit`	`image_urls`	array	Supports multiple references
`flux-2/*-image-to-image`	`input_urls`	array	Supports resolution 1K/2K
`gpt-image-2-image-to-image`	`input_urls`	array	aspect_ratio + resolution required
`google/nano-banana-edit`	`image_urls`	array	output_format (png/jpg) + image_size
`google/nano-banana-2` / `pro`	`image_input`	array	An empty array can also be sent (t2i mode)
`ideogram/v3-edit` / `remix`	`image_url`	string	+ `mask_url` (optional)
`ideogram/character*`	`reference_image_urls`	array	For character reference
`grok-imagine/image-to-image`	`image_urls`	array	nsfw_checker support
`grok-imagine/image-to-video`	`image_urls`	array	+ duration, resolution, mode
`kling-*/image-to-video`	`image_url`	string	+ aspect_ratio, duration
`hailuo/*-image-to-video`	`image_url`	string	+ duration (6/10)
`sora-2-image-to-video`	`image_urls`	array	+ n_frames, aspect_ratio (landscape/portrait)
`wan/2-7-image-to-video`	`first_frame_url`	string	+ `last_frame_url` (optional closing frame)
`wan/2-6-flash-image-to-video`	`image_urls`	array	+ audio (bool, required)
`runway_*`	`imageUrl`	string (camelCase)	Runway internal format
`veo3*` reference-to-video	`imageUrls`	array (camelCase)	+ generationType="REFERENCE_2_VIDEO"
`topaz/*`	`image_url`	string	+ scale (2/4) — upscaler
`elevenlabs/audio-isolation`	`audio_url`	string	Not an image, audio
`elevenlabs/speech-to-text`	`audio_url`	string	Not an image, audio
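To make the matrix concrete, the sketch below shows how a lookup over a few of its rows could translate a generic list of image URLs into the provider-specific field. It illustrates the idea only and is not the gateway's actual implementation: `MAPPING` copies five rows from the table, and `to_provider_field` is a hypothetical name. Patterns are matched with shell-style wildcards, so a model name with an extra suffix would need its own pattern.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Union

# (model pattern, provider parameter, shape), copied from a few matrix rows.
MAPPING = [
    ("seedream/*-edit", "image_urls", "array"),
    ("flux-2/*-image-to-image", "input_urls", "array"),
    ("ideogram/v3-edit", "image_url", "string"),
    ("wan/2-7-image-to-video", "first_frame_url", "string"),
    ("runway_*", "imageUrl", "string"),
]


def to_provider_field(model: str, image_urls: List[str]) -> Dict[str, Union[str, List[str]]]:
    """Map generic image URLs to the field name and shape a provider expects."""
    if not image_urls:
        raise ValueError("at least one image URL is required")
    for pattern, name, shape in MAPPING:
        if fnmatchcase(model, pattern):
            # "string" providers take only the first image; "array" providers take all.
            return {name: list(image_urls) if shape == "array" else image_urls[0]}
    raise KeyError(f"no mapping row matches model {model!r}")
```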

Sora 2 (OpenAI) Models

OpenAI's new Sora 2 video generation models. High quality, natural motion, with audio.

Model	Type	Description
`sora-2-text-to-video`	t2v	Text-to-video (standard quality)
`sora-2-image-to-video`	i2v	Video from a starting image
`sora-2-pro-text-to-video`	t2v	Sora 2 Pro — high-quality t2v
`sora-2-pro-image-to-video`	i2v	Sora 2 Pro — high-quality i2v

Sora 2 Parameters

Parameter	Default	Description
`aspect_ratio`	"landscape"	`"landscape"` or `"portrait"`
`n_frames`	"10"	Seconds/frame count (string)
`size`	"high"	Pro variant only: `"standard"` or `"high"`
`remove_watermark`	false	Watermark removal
`character_id_list`	—	Character reference IDs (array)
`image_urls`	—	Required for i2v (array)
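The Sora 2 parameters differ from the generic ones in a few ways: `aspect_ratio` takes words rather than ratios, `n_frames` is sent as a string, `size` applies only to the Pro variants, and `image_urls` is required for image-to-video. The sketch below encodes those rules; `build_sora2_payload` is a hypothetical helper, and detecting Pro and i2v variants from the model name is an assumption based on the four model names listed above.

```python
from typing import Any, Dict, List, Optional


def build_sora2_payload(
    model: str,
    prompt: str,
    aspect_ratio: str = "landscape",
    n_frames: int = 10,
    size: Optional[str] = None,
    image_urls: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Return a Sora 2 request body that follows the parameter table above."""
    if aspect_ratio not in ("landscape", "portrait"):
        raise ValueError('aspect_ratio must be "landscape" or "portrait"')
    is_pro = "-pro-" in model
    is_i2v = model.endswith("image-to-video")
    if is_i2v and not image_urls:
        raise ValueError("image_urls is required for image-to-video models")
    payload: Dict[str, Any] = {
        "model": model,
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "n_frames": str(n_frames),  # the API expects a string
    }
    if is_pro:
        payload["size"] = size or "high"
        if payload["size"] not in ("standard", "high"):
            raise ValueError('size must be "standard" or "high"')
    elif size is not None:
        raise ValueError("size is only available on Pro variants")
    if image_urls:
        payload["image_urls"] = list(image_urls)
    return payload
```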

Sora 2 Example (Pro, landscape)

cURL

curl -X POST https://api.onysoft.com/v1/video/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2-pro-text-to-video",
    "prompt": "A drone shot of a serene mountain lake at sunrise, mist rising from water",
    "aspect_ratio": "landscape",
    "n_frames": "10",
    "size": "high"
  }'
            

Note: the price returned by the API is the customer price, with your project's or customer's markup rate already applied. Average markup: 50% (projects) and 30% (partner customers).
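As a worked illustration of the markup arithmetic, the sketch below assumes the markup is applied multiplicatively, that is, customer price = base price × (1 + markup rate). That formula, the function name `customer_price` and the base prices used in the usage note are assumptions made for this example; the authoritative figures are whatever GET /v1/models reports.

```python
from decimal import Decimal, ROUND_HALF_UP


def customer_price(base_price: str, markup_rate: str) -> Decimal:
    """Apply a markup rate (e.g. "0.50" for 50%) to a base price in USD.

    Assumes a multiplicative markup; rounds to four decimal places.
    """
    price = Decimal(base_price) * (Decimal("1") + Decimal(markup_rate))
    return price.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
```

Under this assumption, a hypothetical base price of 0.40 USD with a 50% project markup comes to 0.60 USD, and a hypothetical base price of 0.060 USD with a 30% partner markup comes to 0.078 USD.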

Full Python Example (Polling)

Python

import time
import requests

API_KEY = "sk-ony-YOUR-KEY"
BASE = "https://api.onysoft.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# 1. Create the video task (json= also sets the Content-Type header)
resp = requests.post(
    f"{BASE}/video/generate",
    headers=HEADERS,
    json={"model": "veo3_fast", "prompt": "A cat playing piano", "duration": 5},
    timeout=30,
)
resp.raise_for_status()  # stop early on HTTP errors such as 401 or 429
task = resp.json()["data"]
task_id = task["task_id"]
print(f"Task: {task_id}, Status: {task['status']}")

# 2. Query the status (polling)
while True:
    resp = requests.get(f"{BASE}/video/status/{task_id}", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    data = resp.json()["data"]
    print(f"Status: {data['status']}")
    if data["status"] == "completed":
        print(f"Video URL: {data['video_url']}")
        break
    elif data["status"] == "failed":
        print(f"Error: {data.get('error')}")
        break
    time.sleep(10)  # wait 10 seconds between polls
            
