Documentation

Introduction

What is imtovid?

Imtovid is a hosted generator that turns a short text prompt (and optionally a reference image) into a downloadable video clip. There is no separate editor or timeline: everything happens on the /image-to-video route, in the playground form shown in the product screenshots.

Supported inputs

  • Aspect ratio — 16:9, 1:1, or 9:16. Pick this before you render so the clip fits the target channel.
  • Duration — fixed at 5 seconds per clip today; more presets are planned.
  • Prompt — describe the subject and motion in plain language.
  • Start image (optional) — upload a PNG, JPG, or WebP file, or paste a URL, if you want the first frame to match a product photo or storyboard frame.
  • Negative prompt (optional) — list colors, props, styles, or objects that should be avoided.
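
The inputs above can be modeled as a small validated record, for example when preparing prompts in a script before entering them in the playground. This is a minimal sketch, not an imtovid API: the `ClipRequest` class and its field names are hypothetical, and the empty-prompt check is an assumption. Only the allowed aspect ratios and the 5-second duration come from the list above.

```python
from dataclasses import dataclass
from typing import Optional

# Values taken from the "Supported inputs" list.
ALLOWED_ASPECT_RATIOS = ("16:9", "1:1", "9:16")
ALLOWED_DURATIONS_S = (5,)  # only 5-second clips are available today


@dataclass(frozen=True)
class ClipRequest:
    """Hypothetical record mirroring the playground form fields."""

    prompt: str
    aspect_ratio: str
    duration_s: int = 5
    start_image: Optional[str] = None      # local file path or URL
    negative_prompt: Optional[str] = None  # things the clip should avoid

    def __post_init__(self) -> None:
        # Assumption: a blank prompt is treated as invalid.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(
                f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}"
            )
        if self.duration_s not in ALLOWED_DURATIONS_S:
            raise ValueError(
                f"duration_s must be one of {ALLOWED_DURATIONS_S}"
            )


# Example: a vertical clip with a negative prompt and no start image.
request = ClipRequest(
    prompt="A ceramic mug rotating slowly on a wooden table",
    aspect_ratio="9:16",
    negative_prompt="text overlays, hands",
)
```

Keeping the allowed values in module-level tuples means a new duration preset would only need one line changed.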

What the generator returns

Every run produces:

  • A browser preview with playback controls.
  • Download, share, tweak, and “iterate in playground” buttons so you can immediately reuse the prompt.
  • Run statistics (render time and logs) so you can see how long the render took.

Current limits

  • One clip at a time (batching is listed on the roadmap on the pricing page).
  • Start images should be in a common format (PNG/JPG/WebP) and under 25 MB for best results.
  • Audio is not generated—the focus is on motion and camera work.
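
The start-image limit above can be checked locally before uploading. The following is a rough sketch under stated assumptions: the function name `start_image_problems` is hypothetical, `.jpeg` is treated as equivalent to JPG, and "25 MB" is interpreted as 25 × 1024 × 1024 bytes. The documentation does not say how imtovid itself enforces the limit.

```python
import os
from typing import List

# Assumption: ".jpeg" is accepted alongside ".jpg".
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
# Assumption: "25 MB" means 25 * 1024 * 1024 bytes.
MAX_START_IMAGE_BYTES = 25 * 1024 * 1024


def start_image_problems(filename: str, size_bytes: int) -> List[str]:
    """Return a list of human-readable problems; empty means the file looks fine."""
    problems: List[str] = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    # The limit is "under 25 MB", so a file of exactly the maximum size fails.
    if size_bytes >= MAX_START_IMAGE_BYTES:
        problems.append("file is not under 25 MB")
    return problems
```

For a file on disk, `os.path.getsize(path)` supplies the `size_bytes` argument.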

If you only remember one thing: imtovid accepts a prompt, an optional start image, and an optional negative prompt. Everything else in the UI is there to make those inputs faster to adjust.