Content Generation API (Video / Image / Audio / Music)

Generate video, image, audio (TTS) and music with Onysoft AI Gateway. Use 40+ provider models such as Veo, Sora 2, Runway, Kling, Wan, Hailuo, Seedream, Flux, GPT Image, Imagen, Nano Banana, Ideogram, Grok Imagine, ElevenLabs and Suno through a single asynchronous backend.

5 endpoints, 1 shared backend

There are 5 endpoint names for the generation flow, and they all connect to the same backend. Which one you use is purely semantic — it is meant to make your code more readable. The returned task_id is queried the same way for all of them.

Endpoint Names (Aliases)

The 5 endpoints below are fully equivalent. Whichever you send, the same backend runs and the same task_id format is returned. For semantic clarity, prefer the one that fits your model type.

| Endpoint | Recommended Use | Example Models |
| --- | --- | --- |
| POST /v1/video/generate | Video generation | veo3, sora-2, runway, kling, wan, hailuo |
| POST /v1/image/generate | Image generation | seedream, flux, imagen, nano-banana, ideogram, grok-imagine, gpt-image |
| POST /v1/audio/generate | Audio generation (general) | elevenlabs/text-to-speech, sound-effect, audio-isolation, speech-to-text |
| POST /v1/voice/generate | Speech generation (TTS) | elevenlabs/text-to-speech-multilingual-v2, turbo-2-5, v3-text-to-dialogue |
| POST /v1/music/generate | Music generation | suno-v4, ai-music-api/generate, mashup, extend, sounds |
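
Because the aliases are interchangeable, a client can pick one by content kind purely for readability. A minimal sketch follows; `generate_url` and `KINDS` are illustrative names rather than part of the API, and the base URL is the one used in the cURL examples on this page.

```python
BASE = "https://api.onysoft.com/v1"  # base URL as used in the cURL examples

# The five alias names documented above; all of them reach the same backend.
KINDS = ("video", "image", "audio", "voice", "music")


def generate_url(kind: str) -> str:
    """Return the generation endpoint for a content kind (hypothetical helper)."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind {kind!r}; expected one of {KINDS}")
    return f"{BASE}/{kind}/generate"


print(generate_url("music"))
```

Rejecting unknown kinds on the client keeps a typo such as "musik" from turning into a request to a path that does not exist.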

Status Lookup Endpoints

Whichever alias you used for generation, you can use the same alias for status lookup (or any of them — the task_id is global).

  • GET /v1/video/status/{task_id}
  • GET /v1/image/status/{task_id}
  • GET /v1/audio/status/{task_id}
  • GET /v1/voice/status/{task_id}
  • GET /v1/music/status/{task_id}

Asynchronous Design

All of these endpoints work asynchronously. When you send a request you receive a task_id (HTTP 202). You use this ID to query the generation status and check whether your content is ready. Typical duration: image 10-30s, video 30-180s, TTS 5-15s, music 30-90s.
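
The typical durations above suggest a polling budget per content type. The sketch below shows one way a client might derive it. The interval and timeout heuristics are assumptions, not values prescribed by the API, and `wait_for` accepts any callable that returns the current status string, so it can be exercised without network access.

```python
import time
from typing import Callable, Tuple

# Upper end of the typical durations quoted above, in seconds ("voice" = TTS).
TYPICAL_MAX = {"image": 30, "video": 180, "voice": 15, "music": 90}


def poll_plan(kind: str, safety_factor: float = 3.0) -> Tuple[float, float]:
    """Return (interval, timeout) in seconds for a content kind (assumed heuristic)."""
    typical = TYPICAL_MAX[kind]
    interval = max(2.0, typical / 10)  # about ten polls per typical run, never under 2s
    timeout = typical * safety_factor  # give up well past the typical maximum
    return interval, timeout


def wait_for(get_status: Callable[[], str], interval: float, timeout: float,
             sleep: Callable[[float], None] = time.sleep) -> str:
    """Call get_status until it returns 'completed' or 'failed', or the budget runs out."""
    waited = 0.0
    while True:
        status = get_status()
        if status in ("completed", "failed"):
            return status
        if waited >= timeout:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

In real use, `get_status` would wrap a GET to the status endpoint and return the `status` field of the response.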

Text/Chat Exception

For text/chat models, use /v1/chat/completions (synchronous, streaming-capable — see Chat Completions). Special case: Google Gemini image models (gemini-*-image) are called through the chat completions endpoint (multimodal response).
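
The routing rule above can be expressed as a small check. This is only a sketch: `uses_chat_completions` is a hypothetical helper, the `gemini-*-image` glob is taken literally from the pattern quoted above, and the caller is assumed to know whether a model is a text/chat model.

```python
from fnmatch import fnmatchcase


def uses_chat_completions(model: str, is_text_model: bool = False) -> bool:
    """True if the model should be called via /v1/chat/completions.

    Text/chat models and Gemini image models (gemini-*-image) use the
    synchronous chat endpoint; everything else uses the async generation flow.
    """
    return is_text_model or fnmatchcase(model, "gemini-*-image")


print(uses_chat_completions("veo3_fast"))
```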

Generate Video

POST /v1/video/generate

Starts a new video, image or music generation task.

Parameters

model string required

The video model. Example: veo3_fast, kling-2.5, runway_gen4_turbo

prompt string required

Video description (English recommended). Detailed and descriptive prompts produce better results.

aspect_ratio string optional

Aspect ratio: "16:9", "9:16", "1:1". Default: "16:9"

duration integer optional

Video length (seconds). Default: 5. Depending on the model, 5, 6, 8 or 10 seconds are supported.

image_url string required for i2v/i2i

Source image. Required for image-to-video and image-to-image models. Can be a public HTTP URL or a base64 data URI.

image string optional

Alternative to image_url. A base64-encoded image. The system automatically saves it to a temp file and produces a public URL.

resolution string optional

Resolution. Supports different values per model: "720p", "1080p" (video), "1K", "2K", "4K" (image). The backend selects the correct value automatically.

quality string optional

Quality level. "high", "standard", "low". Used only with the gpt-image and seedream models.

negative_prompt string optional

Unwanted elements. Specify which object/style/color you do not want generated.
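
A request body built from these parameters might be assembled as in the sketch below. `build_payload` is an illustrative helper, not part of any SDK. It checks only the two required fields and the `quality` values listed above; it leaves `aspect_ratio` and `duration` unchecked because the accepted values vary by model, and it omits unset options so that the server-side defaults apply.

```python
QUALITIES = {"high", "standard", "low"}  # values listed for the quality parameter


def build_payload(model: str, prompt: str, **options) -> dict:
    """Assemble a generation request body, dropping options that are None."""
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    quality = options.get("quality")
    if quality is not None and quality not in QUALITIES:
        raise ValueError(f"quality must be one of {sorted(QUALITIES)}")
    payload = {"model": model, "prompt": prompt}
    # Unset options are left out so the backend can apply its own defaults.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


print(build_payload("veo3_fast", "A cat playing piano", aspect_ratio="16:9", duration=5))
```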

Example Requests

Video Generation (Text-to-Video)

cURL
curl -X POST https://api.onysoft.com/v1/video/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo3_fast",
    "prompt": "A golden retriever running through a field of sunflowers at sunset",
    "aspect_ratio": "16:9",
    "duration": 5
  }'

Image Generation (Text-to-Image)

cURL
curl -X POST https://api.onysoft.com/v1/video/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/imagen4",
    "prompt": "A beautiful sunset over mountains, photorealistic, 4K",
    "aspect_ratio": "16:9"
  }'

Image Editing (Image-to-Image)

cURL
curl -X POST https://api.onysoft.com/v1/video/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedream/4.5-edit",
    "prompt": "Change background to a futuristic city",
    "image_url": "https://example.com/input.jpg",
    "aspect_ratio": "1:1"
  }'

Music Generation

cURL
curl -X POST https://api.onysoft.com/v1/video/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "suno-v4",
    "prompt": "Upbeat electronic dance music with catchy female vocals about summer"
  }'

Example Response

JSON Response
{
  "success": true,
  "data": {
    "task_id": "vid_abc123...",
    "status": "pending",
    "model": "veo3_fast",
    "estimated_cost": {
      "amount": 0.60,
      "currency": "USD"
    }
  }
}
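
On the client side, the response above can be read into a small record. The sketch below uses only the fields shown in the example. Treating `"success": false` as a rejected request is an assumption, since the error response shape is not documented here, and `SubmittedTask` is an illustrative name.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubmittedTask:
    task_id: str
    status: str
    model: str
    cost_amount: Optional[float]
    cost_currency: Optional[str]


def parse_submission(body: str) -> SubmittedTask:
    """Parse the JSON body returned by a generate call."""
    doc = json.loads(body)
    if not doc.get("success"):  # assumed failure signal, not documented
        raise RuntimeError(f"request was not accepted: {doc}")
    data = doc["data"]
    cost = data.get("estimated_cost") or {}
    return SubmittedTask(data["task_id"], data["status"], data["model"],
                         cost.get("amount"), cost.get("currency"))
```

Keeping the `task_id` in a typed record makes it harder to lose between submission and the later status lookups.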

Status Lookup

GET /v1/video/status/{task_id}

Queries the video generation status.

Status Values

| Status | Description |
| --- | --- |
| pending | Request received, queued |
| processing | Video generation in progress |
| completed | Video ready, URL available |
| failed | Generation failed |

Example Request

cURL
curl https://api.onysoft.com/v1/video/status/vid_abc123 \
  -H "Authorization: Bearer sk-ony-YOUR-KEY"

Example Response (Completed)

JSON Response
{
  "success": true,
  "data": {
    "task_id": "vid_abc123...",
    "status": "completed",
    "video_url": "https://...",
    "thumbnail_url": "https://...",
    "duration": 5,
    "model": "veo3_fast"
  }
}
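
A client usually needs to distinguish terminal states from in-progress ones before reading the result. The sketch below works on the `data` object of a status response. The `error` key on failed tasks is taken from the polling example on this page, and the helper names are illustrative.

```python
TERMINAL_STATES = {"completed", "failed"}


def is_terminal(status: str) -> bool:
    """True once polling can stop."""
    return status in TERMINAL_STATES


def summarize(data: dict) -> str:
    """Turn the data object of a status response into a one-line summary."""
    status = data["status"]
    if status == "completed":
        return f"ready: {data['video_url']}"
    if status == "failed":
        return f"failed: {data.get('error', 'no error detail')}"
    return f"in progress ({status})"


print(summarize({"status": "processing"}))
```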

Video List

GET /v1/video/list

Returns a list of generated videos.

Query Parameters

page integer optional

Page number. Default: 1

per_page integer optional

Records per page. Default: 20, Maximum: 100
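
The query string for this endpoint might be built as in the sketch below. `list_url` is an illustrative helper. Clamping `per_page` to the documented maximum on the client is a design choice of this sketch, since the page does not say how the server treats larger values.

```python
from urllib.parse import urlencode

BASE = "https://api.onysoft.com/v1"  # base URL as used in the cURL examples


def list_url(page: int = 1, per_page: int = 20) -> str:
    """Build the GET /v1/video/list URL with the documented query parameters."""
    if page < 1:
        raise ValueError("page starts at 1")
    per_page = max(1, min(per_page, 100))  # documented maximum is 100
    return f"{BASE}/video/list?" + urlencode({"page": page, "per_page": per_page})


print(list_url(page=2, per_page=50))
```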

Supported Models

The system provides access to 200+ KieAI models. The model list is dynamic and synced from the admin panel. For the current model list and prices, use the Models page or the GET /v1/models endpoint.

Video Models (Featured)

| Model | Provider | Type | Description |
| --- | --- | --- | --- |
| veo3 | Google | Video | Veo 3.1 - Highest quality (t2v/i2v) |
| veo3_fast | Google | Video | Veo 3.1 Fast - Fast generation |
| sora-2-text-to-video | OpenAI | Video | Sora 2 text-to-video |
| sora-2-pro-text-to-video | OpenAI | Video | Sora 2 Pro - high quality |
| runway_gen4_turbo | Runway | Video | Gen-4 Turbo |
| runway_aleph | Runway | Video | Aleph (video-to-video) |
| kling/v2-5-turbo-text-to-video-pro | Kling | Video | Kling 2.5 Turbo Pro |
| kling-2.6/text-to-video | Kling | Video | Kling 2.6 |
| wan/2-6-text-to-video | Alibaba | Video | Wan 2.6 |
| hailuo/2-3-image-to-video-pro | Hailuo | Video | Hailuo 2.3 Pro (i2v) |
| grok-imagine/text-to-video | xAI | Video | Grok Imagine Video |

Image Models (Featured)

| Model | Provider | Type | Description |
| --- | --- | --- | --- |
| grok-imagine/text-to-image | xAI | t2i | Grok Imagine (most compatible) |
| google/imagen4 | Google | t2i | Imagen 4 Standard |
| google/imagen4-fast | Google | t2i | Imagen 4 Fast |
| google/imagen4-ultra | Google | t2i | Imagen 4 Ultra - Highest quality |
| google/nano-banana-2 | Google | t2i | Nano Banana 2 (1K/2K/4K) |
| ideogram/v3-text-to-image | Ideogram | t2i | Ideogram V3 (text-to-image) |
| ideogram/v3-edit | Ideogram | i2i | Ideogram V3 Edit |
| flux-2/pro-text-to-image | Flux | t2i | Flux 2 Pro |
| flux-2/flex-text-to-image | Flux | t2i | Flux 2 Flex |
| seedream/4.5-text-to-image | ByteDance | t2i | Seedream 4.5 (text-to-image) |
| seedream/4.5-edit | ByteDance | i2i | Seedream 4.5 Edit (image-to-image) |
| gpt-image/1.5-text-to-image | OpenAI | t2i | GPT Image 1.5 |
| wan/2-7-image-pro | Alibaba | t2i | Wan 2.7 Image Pro |

Audio Models (TTS / Sound Effect / Speech)

Audio generation via the ElevenLabs provider. Recommended endpoint: POST /v1/audio/generate or POST /v1/voice/generate.

| Model | Type | Description | Price (USD) |
| --- | --- | --- | --- |
| elevenlabs/text-to-speech-multilingual-v2 | TTS | Multilingual TTS (29 languages including Turkish). Recommended. | $0.060 |
| elevenlabs/text-to-speech-turbo-2-5 | TTS | Low-latency TTS, tuned primarily for English. Currently returning an "internal error" on the provider side. | $0.030 |
| elevenlabs/v3-text-to-dialogue | TTS Dialogue | Multi-character dialogue generation (V3 model) | $0.070 |
| elevenlabs/sound-effect-v2 | SFX | Sound effect generation (prompt → audio) | $0.001 |
| elevenlabs/audio-isolation | Audio Filter | Vocal/instrument isolation | $0.001 |
| elevenlabs/speech-to-text | STT | Speech to text (transcription) | $0.0175 |

TTS Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | (none) | The text to synthesize (required) |
| voice | string | "Rachel" | Voice identifier. A value from the list below (case-sensitive). |
| stability | float | 0.5 | Voice consistency (0.0 - 1.0). High = monotone, low = expressive. |
| similarity_boost | float | 0.75 | Voice similarity (0.0 - 1.0). |
| style | float | 0.0 | Style intensity (0.0 - 1.0). Higher = more characterful. |

Supported Voice List (ElevenLabs)

In the ElevenLabs integration via KieAI, only the following 20 voices are available. The values are case-sensitive — the first letter must be capitalized. If you send a voice outside the list, KieAI returns the error "This voice is not within the range of allowed options".

| Female | Male |
| --- | --- |
| Rachel (default), Alice, Aria, Charlotte, Jessica, Laura, Lily, Matilda, Sarah | Brian, Bill, Callum, Charlie, Chris, Daniel, Eric, George, Liam, Roger, Will |
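
Checking the voice name and the numeric ranges before sending avoids a round trip that would end in the provider error quoted above. A minimal sketch, assuming the defaults and ranges from the TTS Parameters table; `build_tts_payload` is an illustrative helper, not an SDK function.

```python
FEMALE = ("Rachel", "Alice", "Aria", "Charlotte", "Jessica", "Laura", "Lily",
          "Matilda", "Sarah")
MALE = ("Brian", "Bill", "Callum", "Charlie", "Chris", "Daniel", "Eric",
        "George", "Liam", "Roger", "Will")
VOICES = frozenset(FEMALE + MALE)  # the 20 voices listed above


def build_tts_payload(text: str, voice: str = "Rachel", stability: float = 0.5,
                      similarity_boost: float = 0.75, style: float = 0.0,
                      model: str = "elevenlabs/text-to-speech-multilingual-v2") -> dict:
    """Validate TTS options locally and return the request body."""
    if voice not in VOICES:
        hint = voice.capitalize()  # names are case-sensitive: "rachel" is rejected
        suffix = f"; did you mean {hint!r}?" if hint in VOICES else ""
        raise ValueError(f"unsupported voice {voice!r}{suffix}")
    for name, value in (("stability", stability),
                        ("similarity_boost", similarity_boost),
                        ("style", style)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return {"model": model, "prompt": text, "voice": voice, "stability": stability,
            "similarity_boost": similarity_boost, "style": style}
```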

For Turkish TTS

Use the elevenlabs/text-to-speech-multilingual-v2 model, which provides natural narration in 29 languages including Turkish. The turbo-2-5 model is tuned primarily for English and is currently affected by a temporary provider-side issue.

TTS Example (Turkish, multilingual-v2)

cURL
curl -X POST https://api.onysoft.com/v1/voice/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs/text-to-speech-multilingual-v2",
    "prompt": "Merhaba, ben Rachel. Onysoft AI üzerinden konuşuyorum.",
    "voice": "Rachel",
    "stability": 0.6,
    "similarity_boost": 0.8,
    "style": 0.1
  }'

Sound Effect Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | (none) | Sound effect description (e.g. "rain on metal roof") |
| duration_seconds | float | 5.0 | Generation length (seconds). Between 1.0 and 22.0. |
| prompt_influence | float | 0.3 | Fidelity to the prompt (0.0 - 1.0). |

Sound Effect Example

cURL
curl -X POST https://api.onysoft.com/v1/audio/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs/sound-effect-v2",
    "prompt": "thunder rolling in distant mountains, heavy rain",
    "duration_seconds": 8.0,
    "prompt_influence": 0.5
  }'

Music Models

Suno and KieAI ai-music-api models for song/music generation. Recommended endpoint: POST /v1/music/generate.

| Model | Provider | Description |
| --- | --- | --- |
| suno-v4 | Suno | Suno V4 - Song generation (instrumental/vocal, multiple genres) |
| ai-music-api/generate | KieAI | Create a new song (Suno behind the scenes) |
| ai-music-api/extend | KieAI | Extend an existing song |
| ai-music-api/mashup | KieAI | Combine 2 songs |
| ai-music-api/sounds | KieAI | Sound effects |
| ai-music-api/upload-and-cover-audio | KieAI | Generate a cover from existing audio |
| ai-music-api/separate-vocals | KieAI | Vocal/instrument separation |
| ai-music-api/create-music-video | KieAI | Generate a visual video for music |
| ai-music-api/add-instrumental | KieAI | Add instrumentals to existing vocals |
| ai-music-api/convert-to-wav-format | KieAI | MP3 → WAV conversion |
| suno-add-vocals | KieAI | Add vocals to an instrumental |
| suno-extend-music | KieAI | Extend a Suno track |
| suno-generate-lyrics | KieAI | Lyrics generation |

Current Prices

Prices are pulled dynamically from the KieAI API. They are shown with the markup rate applied to your project/customer. For current pricing: call GET /v1/models or see the models page.

Per-Model Required Parameters

Each provider requires different parameters. The table below shows the parameters required for each model family. These parameters are filled in automatically on the backend — you only need to send prompt and, optionally, aspect_ratio.

| Provider | Model Family | Auto-Filled Parameters |
| --- | --- | --- |
| Google | imagen4 / imagen4-fast / imagen4-ultra | prompt (only) |
| xAI | grok-imagine/* | aspect_ratio |
| Ideogram | ideogram/v3-text-to-image | prompt (only) |
| Ideogram | ideogram/v3-edit, v3-remix | prompt + image_url (i2i) |
| Ideogram | ideogram/character* | reference_image_urls (i2i) |
| Flux | flux-2/* | aspect_ratio, resolution (1K/2K) |
| ByteDance | seedream/* | quality, n, response_format, size, resolution (2K), aspect_ratio, background |
| OpenAI | gpt-image/* | quality, n, response_format, size, resolution, aspect_ratio, background |
| Alibaba | wan/2-7-image-* | aspect_ratio, resolution (2K/4K) |
| Alibaba | wan video models | duration, resolution (720p/1080p) |
| Google | nano-banana-edit | image_urls (i2i - image required) |

i2i / i2v (image-to-image / image-to-video) Models

When sending a source image for image or video generation, follow this rule:

  • Single image: send image_url (string)
  • Multiple images (reference): send image_urls (array)
  • The backend automatically converts to the actual parameter name each provider expects (imageUrl, input_urls, first_frame_url, image_input, etc.)
  • As a value, use a public HTTP URL (https://...) or a base64 data URI (data:image/jpeg;base64,...) — if base64, the system automatically converts it to a public URL
  • Maximum total request size: 10 MB
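
The rules above can be sketched as three small helpers. All names are illustrative. Reading "10 MB" as 10 × 1024 × 1024 bytes and measuring the JSON-encoded body are assumptions, since the page does not say exactly how the limit is counted.

```python
import base64
import json
from typing import List

MAX_REQUEST_BYTES = 10 * 1024 * 1024  # assumes "10 MB" means 10 MiB


def to_data_uri(raw: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URI."""
    return f"data:{mime};base64,{base64.b64encode(raw).decode('ascii')}"


def image_fields(sources: List[str]) -> dict:
    """Use image_url for a single source and image_urls for several."""
    if not sources:
        return {}
    if len(sources) == 1:
        return {"image_url": sources[0]}
    return {"image_urls": list(sources)}


def fits_limit(payload: dict) -> bool:
    """Check the JSON-encoded request body against the size limit."""
    return len(json.dumps(payload).encode("utf-8")) <= MAX_REQUEST_BYTES
```

Base64 inflates binary data by roughly a third, so a source image of about 7.5 MB already approaches the limit once encoded; a public URL avoids that overhead.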

Provider Parameter Mapping Matrix

The table below shows, for advanced use, the internal parameter names each provider expects. You do not need to know this detail in normal use — sending image_url or image_urls is enough; the backend converts automatically.

| Model Family | Expected Parameter | Type | Note |
| --- | --- | --- | --- |
| seedream/*-edit | image_urls | array | Supports multiple references |
| flux-2/*-image-to-image | input_urls | array | Supports resolution 1K/2K |
| gpt-image-2-image-to-image | input_urls | array | aspect_ratio + resolution required |
| google/nano-banana-edit | image_urls | array | output_format (png/jpg) + image_size |
| google/nano-banana-2 / pro | image_input | array | An empty array can also be sent (t2i mode) |
| ideogram/v3-edit / remix | image_url | string | + mask_url (optional) |
| ideogram/character* | reference_image_urls | array | For character reference |
| grok-imagine/image-to-image | image_urls | array | nsfw_checker support |
| grok-imagine/image-to-video | image_urls | array | + duration, resolution, mode |
| kling-*/image-to-video | image_url | string | + aspect_ratio, duration |
| hailuo/*-image-to-video | image_url | string | + duration (6/10) |
| sora-2-image-to-video | image_urls | array | + n_frames, aspect_ratio (landscape/portrait) |
| wan/2-7-image-to-video | first_frame_url | string | + last_frame_url (optional closing frame) |
| wan/2-6-flash-image-to-video | image_urls | array | + audio (bool, required) |
| runway_* | imageUrl | string (camelCase) | Runway internal format |
| veo3* reference-to-video | imageUrls | array (camelCase) | + generationType="REFERENCE_2_VIDEO" |
| topaz/* | image_url | string | + scale (2/4) - upscaler |
| elevenlabs/audio-isolation | audio_url | string | Audio input, not an image |
| elevenlabs/speech-to-text | audio_url | string | Audio input, not an image |
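
To make the matrix concrete, the sketch below shows how a translation layer could map the unified image list onto a provider-specific parameter. It illustrates the idea for four rows of the table and is not the gateway's actual implementation; `RULES` and `provider_image_param` are hypothetical names.

```python
from fnmatch import fnmatchcase
from typing import List

# (model pattern, provider parameter name, expects an array?) for a few matrix rows
RULES = [
    ("seedream/*-edit", "image_urls", True),
    ("flux-2/*-image-to-image", "input_urls", True),
    ("wan/2-7-image-to-video", "first_frame_url", False),
    ("runway_*", "imageUrl", False),
]


def provider_image_param(model: str, images: List[str]) -> dict:
    """Return the provider-specific image field for a model (illustrative only)."""
    if not images:
        raise ValueError("at least one image is required")
    for pattern, name, is_array in RULES:
        if fnmatchcase(model, pattern):
            # Array-typed parameters take every image; string-typed ones take the first.
            return {name: list(images) if is_array else images[0]}
    raise KeyError(f"no mapping rule for {model!r}")


print(provider_image_param("runway_gen4_turbo", ["https://example.com/input.jpg"]))
```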

Sora 2 (OpenAI) Models

OpenAI's new Sora 2 video generation models. High quality, natural motion, with audio.

| Model | Type | Description |
| --- | --- | --- |
| sora-2-text-to-video | t2v | Text-to-video (standard quality) |
| sora-2-image-to-video | i2v | Video from a starting image |
| sora-2-pro-text-to-video | t2v | Sora 2 Pro - high-quality t2v |
| sora-2-pro-image-to-video | i2v | Sora 2 Pro - high-quality i2v |

Sora 2 Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| aspect_ratio | "landscape" | "landscape" or "portrait" |
| n_frames | "10" | Seconds/frame count (string) |
| size | "high" | Pro variant only: "standard" or "high" |
| remove_watermark | false | Watermark removal |
| character_id_list | (none) | Character reference IDs (array) |
| image_urls | (none) | Required for i2v (array) |
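
The constraints in this table (Pro-only `size`, `image_urls` required for i2v, string-typed `n_frames`) can be checked on the client before submitting. A minimal sketch, assuming the variant can be inferred from the model ID; `build_sora2_payload` is an illustrative helper.

```python
from typing import List, Optional


def build_sora2_payload(model: str, prompt: str, aspect_ratio: str = "landscape",
                        n_frames: str = "10", size: Optional[str] = None,
                        image_urls: Optional[List[str]] = None) -> dict:
    """Validate Sora 2 options locally and return the request body."""
    is_pro = model.startswith("sora-2-pro-")
    is_i2v = model.endswith("image-to-video")
    if aspect_ratio not in ("landscape", "portrait"):
        raise ValueError('aspect_ratio must be "landscape" or "portrait"')
    if is_i2v and not image_urls:
        raise ValueError("image_urls is required for image-to-video models")
    if size is not None:
        if not is_pro:
            raise ValueError("size applies to the Pro variants only")
        if size not in ("standard", "high"):
            raise ValueError('size must be "standard" or "high"')
    # n_frames is sent as a string, matching the parameter table above.
    payload = {"model": model, "prompt": prompt,
               "aspect_ratio": aspect_ratio, "n_frames": str(n_frames)}
    if size is not None:
        payload["size"] = size
    if image_urls:
        payload["image_urls"] = list(image_urls)
    return payload
```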

Sora 2 Example (Pro, landscape)

cURL
curl -X POST https://api.onysoft.com/v1/video/generate \
  -H "Authorization: Bearer sk-ony-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2-pro-text-to-video",
    "prompt": "A drone shot of a serene mountain lake at sunrise, mist rising from water",
    "aspect_ratio": "landscape",
    "n_frames": "10",
    "size": "high"
  }'

Note: the prices the API reports are customer prices, with your project's or customer's markup rate already applied. Average markup: 50% (projects) and 30% (partner customers).
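
As a worked illustration of the markup arithmetic, the sketch below assumes the markup is applied multiplicatively to the provider's base price, which this page does not spell out. The base price of 0.40 USD is a made-up figure chosen for round numbers, and `customer_price` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP


def customer_price(base: Decimal, markup_rate: Decimal) -> Decimal:
    """Apply a markup rate (e.g. 0.50 for 50%) to a base price, rounded to cents."""
    total = base * (Decimal(1) + markup_rate)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


base = Decimal("0.40")  # hypothetical provider price in USD
print(customer_price(base, Decimal("0.50")))  # project markup
print(customer_price(base, Decimal("0.30")))  # partner customer markup
```

Using `Decimal` instead of `float` keeps currency arithmetic exact at the cent level.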

Full Python Example (Polling)

Python
import time
import requests

API_KEY = "sk-ony-YOUR-KEY"
BASE = "https://api.onysoft.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# 1. Create the video task (requests sets Content-Type when json= is used)
resp = requests.post(
    f"{BASE}/video/generate",
    headers=HEADERS,
    json={"model": "veo3_fast", "prompt": "A cat playing piano", "duration": 5},
    timeout=30,
)
resp.raise_for_status()  # stop early on HTTP errors
task = resp.json()["data"]
task_id = task["task_id"]
print(f"Task: {task_id}, Status: {task['status']}")

# 2. Poll the status until the task reaches a terminal state
while True:
    resp = requests.get(f"{BASE}/video/status/{task_id}", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    data = resp.json()["data"]
    print(f"Status: {data['status']}")
    if data["status"] == "completed":
        print(f"Video URL: {data['video_url']}")
        break
    elif data["status"] == "failed":
        print(f"Error: {data.get('error')}")
        break
    time.sleep(10)  # wait 10 seconds between polls