Everything you need to ship 12-file multimodal video with Seedance 2.0.
10 dated guides on Seedance 2.0 end to end. Native audio in one pass, 9 image plus 3 video plus 3 audio refs, running on fal.ai as ByteDance's chosen enterprise partner. Real numbers, real code, real pipelines.
Dated, opinionated, written once and kept current. Every entry is one subject, answered. No filler, just signal.
Real renders from the actual endpoint. Prompts are shown beneath each clip so you can copy the pattern.
See Seedance 2.0 in action.
A chef in a professional kitchen at dusk ties an apron in front of a pass window and looks up to say: 'Tonight, the kitchen opens at seven.'
A matte black wireless headphone slowly rotates on a concrete pedestal under one warm key light.
A calm coastal lake at dawn, camera slow push-in from behind a lighthouse toward the still water. Golden sunrise breaks across the horizon.
Two engineers in a server room look at a monitor. First says: 'That timing is off by 40 ms.' Second answers: 'Then we ship the rollback.'
Wide slow-motion shot of a falcon launching from a cliff at sunset, backlit by gold, feathers rim-lit, one long distant drumbeat over low wind.
A vintage analog synthesizer on a polished wooden desk, camera slow dolly past the glowing knobs and patch cables. Soft backlight, warm bokeh, tubes warming up.
Frequently asked.
01 How much does Seedance 2.0 cost on fal.ai?
Standard tier at `bytedance/seedance-2.0/text-to-video` bills at roughly $0.3034 per second of 720p output, with native audio included at no extra cost. A 5 second 720p render lands at $1.52. The Fast tier at `bytedance/seedance-2.0/fast/text-to-video` drops to $0.2419 per second on the same schema, about 20 percent cheaper for iteration. Reference-to-video with video refs applies a 0.6x duration multiplier, effectively $0.1814 per second. Math follows the token formula (height x width x duration x 24) divided by 1024 at $0.014 per 1k tokens on standard and $0.0112 per 1k on Fast. Validate at fal.ai/pricing.
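The token math above can be sketched as a quick cost estimator. A sketch only: the frame dimensions are illustrative 16:9 sizes (not confirmed schema values), the per-second headline figures round slightly differently, and the rates should be validated against fal.ai/pricing.

```python
# Estimate Seedance 2.0 cost on fal.ai from the token formula:
#   tokens = (height * width * duration_s * 24) / 1024, billed per 1k tokens.
# Frame sizes here are illustrative 16:9 dimensions, not confirmed schema values.
RESOLUTIONS = {"480p": (480, 854), "720p": (720, 1280)}
RATE_PER_1K = {"standard": 0.014, "fast": 0.0112}

def estimate_cost(resolution: str, duration_s: float, tier: str = "standard",
                  video_refs: bool = False) -> float:
    height, width = RESOLUTIONS[resolution]
    tokens = (height * width * duration_s * 24) / 1024
    cost = tokens / 1000 * RATE_PER_1K[tier]
    if video_refs:  # reference-to-video with video refs bills 0.6x of rendered time
        cost *= 0.6
    return round(cost, 2)

print(estimate_cost("720p", 5))               # about $1.51 for a 5 s standard render
print(estimate_cost("720p", 5, tier="fast"))  # about $1.21 on the Fast tier
```

The same function covers the 0.6x video-ref multiplier, so one estimator serves all six endpoints.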
02 What is the max resolution?
Seedance 2.0 caps at 720p. You pick between 480p and 720p on `bytedance/seedance-2.0/text-to-video`, with no 1080p or 4K option. The legacy Seedance 1.5 Pro endpoint still reaches 1080p if you need it, and Veo 3.1 pushes to 4K for broadcast work. For social delivery, creator content, and preview reviews, 720p with native audio is where most teams land; the cinematic grade and token budget go further at 720p than stretched 1080p. Upscale downstream with a dedicated upscaler if your target needs more pixel density.
03 How many reference images, videos, and audios can I pass?
Up to 12 files total split across three channels: 9 images, 3 videos, and 3 audios. That is the full multimodal surface on `bytedance/seedance-2.0/reference-to-video`. Images hold character, wardrobe, and composition anchors. Video refs drive camera move and motion rhythm. Audio refs drive room tone, ambience, and voice character. Duration caps still apply to the output: 4 to 15 seconds per single shot. Reference-to-video with video refs uses a 0.6x duration multiplier, so you pay for 60 percent of rendered time when video conditioning is active.
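A pre-flight check for those limits can save a failed call. This is a hypothetical client-side helper based on the numbers above (9/3/3 split, 4 to 15 second shots, 0.6x billing with video refs), not part of any fal SDK:

```python
# Validate a reference set against the documented 12-file limits:
# up to 9 images, 3 videos, 3 audios, and a 4-15 s single-shot duration.
LIMITS = {"images": 9, "videos": 3, "audios": 3}

def check_refs(images=(), videos=(), audios=(), duration_s=5):
    refs = {"images": list(images), "videos": list(videos), "audios": list(audios)}
    for channel, files in refs.items():
        if len(files) > LIMITS[channel]:
            raise ValueError(f"{channel}: {len(files)} refs exceeds limit of {LIMITS[channel]}")
    if not 4 <= duration_s <= 15:
        raise ValueError("duration must be 4-15 seconds per single shot")
    # video refs trigger the 0.6x duration multiplier on billing
    billed_s = duration_s * (0.6 if videos else 1.0)
    return billed_s

print(check_refs(images=["hero.png"], videos=["cam_move.mp4"], duration_s=10))  # 6.0
```

Returning the billed seconds makes the 0.6x multiplier visible before you commit the render.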
04 Can I generate videos with images of real people?
The face filter that blocks generation of real individuals without identity verification sits at the ByteDance model layer, not at fal.ai. No API provider has a bypass. Operators who need portraits of real people route through identity verification and licensed talent pipelines, which is the standard across every major commercial video provider. AI-generated portraits (faces that do not match a real person) remain the documented path on `bytedance/seedance-2.0/reference-to-video` when you want character continuity without licensing overhead. Brand campaigns with cleared talent use the verified-identity intake.
05 Seedance 2.0 vs Kling 3.0 Pro: which should I pick?
Pick `bytedance/seedance-2.0/text-to-video` when your brief uses multiple reference channels (image plus video plus audio), when you need 15 seconds of single-shot duration, and when you want native audio in a single pass. Pick Kling 3.0 Pro when you need 1080p output, when motion smoothness is the top priority, and when you are already in a Kling storyboarding flow. Seedance leads the head-to-head on I2V Arena Elo (1346 vs 1282) and on the multimodal surface. Kling leads on resolution ceiling and per-second price at 1080p.
06 What is the Fast tier?
`bytedance/seedance-2.0/fast/text-to-video` and `bytedance/seedance-2.0/fast/reference-to-video` are quicker-turnaround variants at $0.2419 per second (about 20 percent cheaper than the $0.3034 standard tier). Same input schema, same 12-file multimodal surface, same 720p ceiling, same duration caps. Use Fast for iteration passes where you want 20 versions of a single shot before committing to the final render. Native audio remains included at no extra cost. Token formula shifts from $0.014 per 1k to $0.0112 per 1k, which is where the Fast savings come from.
07 How do I call it from Python?
Install `fal-client`, set `FAL_KEY` in your environment, and subscribe to `bytedance/seedance-2.0/text-to-video`. The input dictionary mirrors the TypeScript SDK: prompt, duration (4 to 15), resolution (480p or 720p), aspect_ratio, generate_audio, and seed. Use `fal_client.subscribe` for synchronous waits or `fal_client.submit` for async jobs with webhooks. The queue returns logs you can stream with `with_logs=True`. Full schema and code shape live on the endpoint page under fal.ai/models. The Fast tier swaps the endpoint path to `bytedance/seedance-2.0/fast/text-to-video` without any other code change.
08 What happens when a render fails?
The fal async queue behind `bytedance/seedance-2.0/text-to-video` returns structured errors with the rule id or timeout reason. Soft retries: drop the duration from 15 seconds to 8, switch the seed, or loosen the prompt if a content rule fired. Hard retries: route the same input to the Fast tier for a lighter compute path, or fall back to Seedance 1.5 Pro if you need 1080p. The queue logs surface in your dashboard, and webhooks report completion state to your server. Transient queue failures retry once automatically; committed failures surface a clear error code.
09 Why run Seedance 2.0 on fal.ai?
Eight reasons. One, fal.ai is ByteDance's chosen enterprise partner for Seedance 2.0 with day-one access to all six endpoints. Two, a single FAL_KEY speaks to 600+ models, so your pipeline does not fragment across providers. Three, serverless scale with no cold starts plus an async queue that supports webhooks for fan-out. Four, the Fast tier at $0.2419 per second for iteration budgets on `bytedance/seedance-2.0/fast/text-to-video`. Five, regional points of presence for lower latency. Six, one `@fal-ai/client` SDK in TypeScript, Python, and Swift. Seven, free signup credits to kick the tires. Eight, Slack and Discord access to the fal team when a pipeline question needs a human.
10 What formats and aspect ratios does Seedance 2.0 support?
`bytedance/seedance-2.0/text-to-video` accepts aspect ratios 21:9, 16:9, 4:3, 1:1, 3:4, 9:16, plus an auto mode that picks from the prompt. Duration ranges 4, 5, 8, 10, 12, and 15 seconds. Resolution is 480p or 720p. Output is MP4 with H.264 video and AAC audio when generate_audio is on. Seed input is supported for reproducible renders. For vertical social deliverables, 9:16 at 720p is the common pick; for cinematic widescreen, 21:9 is available natively without cropping downstream. Native audio runs at the same encode as the video track.
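Those option sets are easy to enforce before a request leaves your pipeline. A hypothetical validator built from the values listed above:

```python
# Validate render parameters against the documented option sets.
ASPECTS = {"21:9", "16:9", "4:3", "1:1", "3:4", "9:16", "auto"}
DURATIONS = {4, 5, 8, 10, 12, 15}
RESOLUTIONS = {"480p", "720p"}

def validate_params(duration: int, resolution: str, aspect_ratio: str = "auto") -> bool:
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {sorted(DURATIONS)}")
    if resolution not in RESOLUTIONS:
        raise ValueError("resolution is 480p or 720p; upscale downstream for more")
    if aspect_ratio not in ASPECTS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECTS)}")
    return True

validate_params(5, "720p", "9:16")  # a vertical social deliverable passes
```

Failing fast on a bad duration costs nothing; failing at the endpoint costs queue time.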
Seedance 2.0 at a glance.
Seedance 2.0 is ByteDance Seed's flagship video model, announced February 12, 2026 and live on fal.ai as its chosen enterprise partner since April 15, 2026. The signature edge is a 12-file multimodal reference surface: up to 9 images, 3 videos, and 3 audios passed into a single call, combined with native audio generation in the same forward pass. You brief a shot with a character still, a camera-move reference clip, and a room-tone sample, and Seedance 2.0 returns the take with audio baked in. No separate TTS stitch, no lip sync pass, no identity matcher layered on top. On the Artificial Analysis Arena, Seedance 2.0 sits at rank two on both legs of the leaderboard with Elo 1270 on text-to-video and 1346 on image-to-video. HappyHorse 1.0 currently leads the Arena; Seedance 2.0 is the strongest of the major commercial endpoints that is generally available without partner allowlist.
The honest caveats matter. Max output resolution is 720p. You do not get 1080p or 4K out of Seedance 2.0, and if your delivery target is broadcast, you will upscale downstream or pick a different model. The model-layer face filter blocks generation of real people without identity verification. This is enforced at the ByteDance model layer, not at fal.ai, so no API provider has a bypass. Operators that need portraits of real individuals route through licensed talent and verified identity pipelines; AI-generated portraits remain the documented path across all commercial video providers. Where Seedance 2.0 wins: 15 second maximum single-shot duration beats Veo 3.1's 8 seconds, the 12-file multimodal reference surface is unique across the cohort, and native audio is included at no extra charge while Veo 3.1 charges $0.40 per second at 1080p.
Against the field, the picks sort cleanly. Against Kling 3.0 Pro: pick Seedance for longer duration, multimodal ref control, and joint audio. Against Veo 3.1: pick Seedance when $0.30 per second beats $0.40 per second and 15 seconds beats 8 seconds, concede to Veo when you need 4K or the broadcast color pipeline. Against Grok Imagine v1.0: pick Seedance for cinematic narrative work and multi-reference briefs, concede to Grok on raw iteration speed and $0.07 per second pricing. The Fast tier at $0.2419 per second gives you the same schema on iteration budgets, and six endpoints (text-to-video, image-to-video, reference-to-video plus Fast variants of each) cover every entry point a production pipeline needs.
- 01 Indie film directors briefing cinematic narrative shots with multi-reference control
- 02 Agency teams producing short-form content that needs native audio in one pass
- 03 Ad studios working with licensed talent portraits and brand-approved reference imagery
- 04 Research groups benchmarking hosted video generation against the Arena leaderboard
- 05 Creative pipelines that need the 12-file multimodal surface for character continuity
- 01 Your brief uses more than one reference channel (image plus video plus audio)
- 02 You need 15 second single-shot duration where Veo 3.1 tops out at 8 seconds
- 03 You want identical character and motion across multiple shots using shared refs
- 04 Your budget prefers $0.30 per second for cinematic work over $0.40 per second
- 05 You need native audio baked into the render without a separate synthesis pass
fal.ai is ByteDance's chosen enterprise partner for Seedance 2.0, giving you day-one access to all six endpoints, the full 12-file multimodal reference surface, and the Fast tier at $0.2419 per second for iteration. One fal.subscribe call, one billing dashboard, one SDK for every other model you need to route to next.
Call Seedance 2.0 in under 20 lines.
import { fal } from "@fal-ai/client";

fal.config({ credentials: process.env.FAL_KEY });

// Seedance 2.0 text-to-video on fal.ai
const result = await fal.subscribe("bytedance/seedance-2.0/text-to-video", {
  input: {
    prompt: "A chef in a professional kitchen at dusk looks to camera and says: 'Tonight, the kitchen opens at seven.' Warm key light, shallow depth of field, soft extractor hum, 24fps cinematic grade.",
    duration: 5, // 4 to 15 seconds
    resolution: "720p", // 480p or 720p
    aspect_ratio: "16:9",
    generate_audio: true, // native audio on by default
    seed: 42,
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs?.map((log) => log.message).forEach(console.log);
    }
  },
});

console.log(result.data.video.url);
{ video: { url: "https://v3.fal.media/files/..." }, seed: 42 }

What Seedance 2.0 costs on fal.ai.
5s 720p audio on
5s 720p from still
5s 720p, 4 image refs
5s 720p, 2 video refs
5s 720p audio on
5s 720p, 2 video refs
Pricing via token formula (h x w x duration x 24) / 1024 at $0.014/1k on standard, $0.0112/1k on Fast. Native audio included at no extra cost.
Official pricing page

Seedance 2.0 vs the field.
12-file multimodal refs, native audio, cinematic narrative
Motion smoothness, storyboarding
Broadcast-grade color, cinematic finish
Fastest and cheapest iteration
Seedance 2.0 leads the Kling/Veo/Grok cohort on duration, multimodal ref control, and I2V Arena Elo. Pick it when your brief needs more than one reference channel and you want audio baked in.
The posts we point people at when they ask where to start with Seedance 2.0.
Three to read first.
Debugging Seedance 2.0: Face Blocks, IP Warnings, and What to Do
Two Seedance 2.0 filters block a lot of first tries: the face guardrail on real-person references, and the copyright guardrail on trademarked characters. Both live at the model layer.
Seedance 2.0 Fast Tier vs Standard: The Pricing Math
Image-to-Video: Character Consistency Patterns That Hold
Every topic we cover.
Technique
- Image-to-Video: Character Consistency Patterns That Hold
- Native Audio Generation: When to Keep It On and When to Skip
Troubleshooting
Pricing
Integration
Prompting
Use case
Comparisons
Comparison
Workflow
Latest posts.
Integrating Seedance 2.0 Into a Production Render Queue
Async queue pattern for Seedance 2.0: submit with webhooks, poll for status, track cost per job, and retry around the face filter without burning budget.
Native Audio Generation: When to Keep It On and When to Skip
Prompting Seedance 2.0 with 12-File Multimodal References
Reference-to-Video: Building a Brief With the Rule of 12
Seedance 2.0 vs Kling 3.0 Pro vs Veo 3.1: Who Wins When
When to Fall Back to Seedance 1.5 Pro
Seedance 2.0 caps at 720p while 1.5 Pro still outputs 1080p. A guide to when the older model wins: broadcast deliverables, upscale pipelines, and archival clients who contract 1080p masters.
The numbers.
What this publication is and isn't, in numbers.
Each one is dated, second-person, and opinionated.
Filter by the constraint you care about.
Total length of every post in the archive.
Not a single U+2014 survives our ship check.
Editor-selected cover stories.
Custom covers on every featured post.
What we write about most.
Keyword frequency across every post. The bigger the word, the more often we come back to it.
Keep reading. The full blog is open.
No gates, no sign-up, no newsletter. Just 10 dated posts on Seedance 2.0.
Browse the full blog
Sort by date, filter by category, search by keyword.