Translating an SRT is harder than translating prose
Drop a French novel into Google Translate and you get a usable English novel. Drop a French SRT into Google Translate and you get a corrupted file with broken timestamps, glued cues, and lines that drift off-screen because the new language took 40% more characters than the old one. Subtitles look like prose; they behave like code.
VideoCue's Subtitle Translator is built for the file format, not the language. It parses your SRT into a list of cues, translates each cue independently while passing surrounding cues as context, then re-assembles a well-formed SRT with every timestamp intact. The output drops back into Premiere, DaVinci, FCP, or any web player without a single re-sync.
Why per-cue translation matters
A standard SRT cue looks like:
CODEBLOCK0
Three things have to survive translation: the index (12), the in/out timecodes (00:01:24,500 --> 00:01:27,200), and the text. A naive translation pipeline runs the whole file as one string through a translator, and the translator dutifully "translates" the timecodes: swapping the comma before the milliseconds for a period, breaking line numbers, and silently shifting cues. By the time you notice, you've lost an hour.
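One way to catch that failure early is to compare the structural lines of the file before and after translation. The Python sketch below is illustrative only, not part of VideoCue; `structural_fields` and `structure_preserved` are hypothetical names, and it assumes each timecode line sits directly under its index line with no positioning data after the timestamps.

```python
import re
from typing import List, Tuple

# A strict SRT timing line: comma before the milliseconds, " --> " between in and out.
TIMING_LINE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def structural_fields(srt: str) -> List[Tuple[str, str]]:
    """Return (index, timing) pairs for every well-formed timing line."""
    fields = []
    lines = srt.replace("\r\n", "\n").split("\n")
    for i, line in enumerate(lines):
        if i > 0 and TIMING_LINE.match(line.strip()):
            # The line directly above a timing line is treated as the cue index.
            fields.append((lines[i - 1].strip(), line.strip()))
    return fields


def structure_preserved(source: str, translated: str) -> bool:
    """True only if indices and timecodes are identical in both files."""
    return structural_fields(source) == structural_fields(translated)
```

A file whose timecodes came back as 00:01:24.500 no longer matches the strict pattern, so its cue disappears from the structural list and the comparison fails.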
VideoCue's approach: parse the file structurally, hold every non-text field as-is, send only the text body through Cue. The structural fields can't be corrupted because they never touch the translation model. The text comes back in the target language, gets re-inserted, and the file is re-serialised with every index and timecode byte-identical to the source.
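A minimal Python sketch of that parse, translate, re-serialise flow. It is not VideoCue's actual implementation: `SrtCue`, `parse_srt`, `serialize_srt`, `translate_srt`, and the `translate` callback are hypothetical names, and the parser assumes a well-formed file whose cues are separated by blank lines.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


@dataclass
class SrtCue:
    index: str   # kept as the original string, never re-numbered
    timing: str  # the full "in --> out" line, kept verbatim
    text: str    # the only field that is sent for translation


def parse_srt(srt: str) -> List[SrtCue]:
    """Split an SRT string into cues, rejecting blocks without a timing line."""
    normalized = srt.lstrip("\ufeff").replace("\r\n", "\n").strip()
    cues = []
    for block in re.split(r"\n{2,}", normalized):
        lines = block.split("\n")
        if len(lines) < 2 or not TIMECODE.match(lines[1]):
            raise ValueError(f"malformed cue: {block!r}")
        cues.append(SrtCue(lines[0].strip(), lines[1], "\n".join(lines[2:])))
    return cues


def serialize_srt(cues: List[SrtCue]) -> str:
    """Write cues back out with one blank line between them."""
    return "\n\n".join(f"{c.index}\n{c.timing}\n{c.text}" for c in cues) + "\n"


def translate_srt(srt: str, translate: Callable[[str], str]) -> str:
    """Translate only the text body; index and timing never reach `translate`."""
    cues = parse_srt(srt)
    for cue in cues:
        cue.text = translate(cue.text)
    return serialize_srt(cues)
```

Because `translate` only ever receives `cue.text`, a misbehaving translator can damage the wording but has no way to reach the index or the timecodes.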
Context-aware translation
A 5-word cue in isolation is a hard translation target. "He was wrong about everything" could be a confession, an accusation, or a punchline depending on what came before. Translating it without context strips it of register and tone.
Our tool passes the 2-3 cues before and after each cue to Cue as context. The translation model sees the local narrative, picks the right register, and reads idioms correctly. The result reads as natural target-language prose, not as a flattened literal translation.
Where this matters most:
- Idioms and metaphors. "Bite the bullet" translated literally is gibberish in French. With context, Cue picks the right local equivalent ("se jeter à l'eau" or similar).
- Pronoun gender. Many languages distinguish gender where English doesn't. Context lets Cue infer which gender to use.
- Formal vs casual register. "You" is one word in English, multiple in French/German/Japanese. Surrounding dialogue tells Cue whether the speaker is being formal or casual.
- Dramatic emphasis. Italics, exclamation, and pacing cues all read more naturally when the model has seen the buildup.
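The context window described above can be sketched in a few lines of Python. This is an illustration rather than VideoCue's code: `build_context_windows` is a hypothetical helper, the default radius of 2 is one of the values in the stated 2-3 range, and how the window is actually presented to Cue is not specified here.

```python
from typing import Dict, List


def build_context_windows(texts: List[str], radius: int = 2) -> List[Dict[str, object]]:
    """Pair each cue text with up to `radius` neighbours on either side."""
    windows = []
    for i, text in enumerate(texts):
        windows.append({
            "before": texts[max(0, i - radius):i],     # shorter at the start of the file
            "target": text,                            # the only cue to be translated
            "after": texts[i + 1:i + 1 + radius],      # shorter at the end of the file
        })
    return windows
```

Only the `target` text is meant to come back translated; the neighbours are there so the model can judge register, gender, and idiom.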
The languages supported
Currently: English, Spanish (ES, LATAM), French, German, Italian, Portuguese (PT, BR), Dutch, Polish, Russian, Turkish, Arabic, Hebrew, Japanese, Korean, Chinese (Simplified, Traditional), Vietnamese, Thai, Indonesian, Filipino, Hindi, and a growing list of African and Nordic languages.
Translation quality is strongest in the major language pairs (EN-ES, EN-FR, EN-DE, EN-PT, EN-JA, EN-ZH). Less-resourced pairs (e.g. EN-Swahili) are functional but you should expect to do a human polish pass. We surface a confidence indicator in the output for any cue Cue is less certain about.
Length expansion and timing
Translation almost always changes character count. German tends to expand 30-40% over English. Japanese and Chinese compress 40-50%. Spanish and French expand 10-25%. Your timing was set against the source text, so when the target text is longer, you risk subtitles that viewers cannot finish reading before the cue leaves the screen.
Our tool flags any cue where the translated text is significantly longer than what fits comfortably at the source duration. You can shorten manually, or trust that most viewers read faster than the safe limits assume. For high-stakes work such as broadcast captioning or accessibility, do a human-eye pass on every expansion warning.
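A rough Python sketch of such a check, measuring reading speed in characters per second against the cue's original duration. The function names are hypothetical and the 17 characters-per-second threshold is an illustrative assumption, not VideoCue's actual limit; the sketch also assumes a plain timing line with no positioning data after the timestamps.

```python
def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:24,500 to seconds."""
    hours, minutes, rest = stamp.split(":")
    seconds, millis = rest.split(",")
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def cue_duration(timing: str) -> float:
    """Duration in seconds of a timing line like '00:01:24,500 --> 00:01:27,200'."""
    start, end = [part.strip() for part in timing.split("-->")]
    return to_seconds(end) - to_seconds(start)


def needs_review(timing: str, translated: str, max_cps: float = 17.0) -> bool:
    """Flag a cue whose translated text exceeds `max_cps` characters per second."""
    visible = translated.replace("\n", "")  # line breaks are not read as characters
    duration = cue_duration(timing)
    if duration <= 0:
        return True  # a zero or negative duration always deserves a human look
    return len(visible) / duration > max_cps
```

For the 2.7-second cue used earlier, a 30-character line passes comfortably, while a 60-character translation would be flagged.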
Re-translating an already-translated file
Sometimes you have a translated SRT and want to translate it again into a third language. Our tool round-trips safely: the structure stays intact, and Cue handles the second-language source as confidently as the first. We don't recommend translating through English ("FR → EN → JA") for high-quality work, though. Go direct (FR → JA) when possible, because every hop loses subtle context.
Privacy
Translation is stateless. Your SRT is sent to Cue for the translation pass and then discarded. We don't store, log, or train on subtitle content. Subtitles often contain unaired script content, so we treat the request the same as any high-trust content type.
When to upgrade
The free tool caps at 500 cues per file, which is comfortable for most YouTube videos but tight for hour-long documentaries or feature-length narrative. Inside the paid VideoCue app, the cap is removed and translations can be queued in bulk across whole projects.
Related tools
- SRT Converter: convert your file into SRT first if you've got VTT or plain text.
- YouTube Transcript: pull a transcript from any video to translate.
- Chapter Generator: translate chapter titles for international YouTube audiences.
- Filmwiki: SRT, VTT, Closed caption, Subtitle, Burn-in.