Audiovisual Translation Glossary

This Audiovisual Translation Glossary provides simple, jargon-free definitions of the most widely used terms in audiovisual translation!

Welcome to our Audiovisual Translation Glossary! We hope it saves you the hassle of wading through overly convoluted explanations of industry terms: every definition here is simple and jargon-free. We hope you find it useful!



Audio description: Voice-over narration, placed during natural pauses in the audio of a video or theatre play and sometimes during dialogue if deemed necessary.
Its purpose is to provide information on key visual elements, actions, etc. in audiovisual media for the benefit of the blind and visually impaired audience.
Block subtitle: A subtitle that appears on screen for a set number of seconds before being replaced by the next one. Block subtitles are usually edited (i.e. speech is summarised),
timed to shot changes, and broken at logical points in the text.
Burnt-in subtitle: Subtitles that can’t be turned on and off in a video, as they are embedded in the video frames. Initially, subtitles were optically printed onto a film strip, hence the term “burnt”.
Caption: In the United States and Canada, “captions” are what is known in the United Kingdom and most other countries as “subtitles for the deaf and hard of hearing”.
See Subtitling for the Deaf and Hard of Hearing.
Character limit: The maximum number of characters allowed in a subtitle line, counting letters, spaces, punctuation and symbols. It varies depending on in-country accessibility standards, medium (e.g. streaming platform, broadcast TV),
language, etc. Character limits can vary significantly: for example, Netflix allows up to 42 characters per line for alphabetical languages, while the BBC recommends no more than 37.
Characters per second: Reading speed measure in subtitling; the number of characters shown on screen in the space of a second.
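For the technically minded, the calculation is simple division: character count over on-screen duration. A minimal sketch (the subtitle text and timings below are invented for illustration):

```python
def chars_per_second(text: str, start_ms: int, end_ms: int) -> float:
    """Reading speed of one subtitle in characters per second.

    Counts every character shown on screen, including spaces and
    punctuation; the line-break character itself is not counted.
    """
    char_count = len(text.replace("\n", ""))
    duration_s = (end_ms - start_ms) / 1000
    return char_count / duration_s

# A two-line subtitle displayed for 2.5 seconds (hypothetical example):
cps = chars_per_second("I never said that.\nYou must be mistaken.", 1000, 3500)
print(round(cps, 1))  # 15.6
```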
Closed: Said of subtitles that can be turned on and off in a video. See Open.
CPS: See Characters per second.
Cueing: See Time-coding.
Dialogue replacement: Production technique in which dialogue is re-recorded after the filming process.
In audiovisual translation, it consists of substituting the original audio with a translation of the dialogue that starts and ends where the original voice would.
However, it’s not done in a frame-accurate way, and lip movements aren’t synchronized. This makes it a cheaper approach than lip-sync dubbing, and it is generally used for e-Learning scenarios,
corporate presentations and informational videos.
Dubbing: See Lip-sync dubbing.
Edited subtitles: Subtitles that don’t render dialogue word-for-word, but instead summarise, paraphrase or omit speech to varying degrees due to space and reading speed constraints.
Forced Narrative: Text that clarifies speech or on-screen elements meant to be understood by the viewer of a video. Forced narrative subtitles are displayed even when subtitles are not turned on.
They are used in the following cases: to subtitle a foreign or fictional language that differs from the original language of the video but is meant to be understood,
to translate text in original-language graphics, and to transcribe dialogue in the same language as the audio for clarity when it is inaudible or distorted.
Frame: One of the many still images which compose a complete moving picture. “Frame” can also be used as a unit of time, but its actual duration depends on the frame rate at which it is displayed.
See Frames per second and Frame Rate.
Frames per second: The number of frames displayed in a second. If 25 frames are displayed within a second, the duration of each frame is 40 milliseconds.
See Frame and Frame Rate.
Frame rate: The frequency at which consecutive video images (frames) are displayed, in rapid succession, to create the visual effect of motion.
Expressed in frames per second, or fps. Common frame rates include 24 fps, 25 fps and 30 fps. See Frame and Frames per second.
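The arithmetic behind frame durations can be sketched in a couple of lines (a hypothetical illustration, not tied to any particular tool):

```python
def frame_duration_ms(fps: float) -> float:
    """Duration of a single frame in milliseconds at a given frame rate."""
    return 1000 / fps

# Common broadcast and cinema frame rates:
for fps in (24, 25, 30):
    print(fps, "fps ->", round(frame_duration_ms(fps), 2), "ms per frame")
```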
FPS: See Frames per second.
Graphics’ text: Text that is part of the graphics of a video file, and therefore is not editable unless the source files are available.
There are several ways of dealing with the translation of graphics’ text, from forced narratives (see Forced Narrative) to full replacement
by reproducing the formatting of the original in the target language. Of course, the latter is much more costly and time-consuming than the former.
Interlingual bilingual subtitles: Also called “dual” subtitles, they are subtitles in two different languages that are displayed simultaneously.
They are used in countries where the audience speaks different languages or as a language learning tool.
Interlingual subtitles: Subtitles in a language other than that of the video’s original audio.
Intralingual subtitles: Subtitles in the same language as the original audio of the video.
Line breaks: The points where a subtitle line ends. Since the length of the lines in a subtitle is constrained, a sentence often doesn’t fit in one line or one subtitle.
Subtitles and lines should be broken at logical points, ideally at a piece of punctuation like a full stop, comma or dash. If that’s not possible, avoid separating parts of speech that belong together, such as an article and its noun
(e.g. the + bus, a + taxi). Other factors to take into account when deciding the best layout for a subtitle are the visibility of the image behind the subtitles, the speakers’ position, and readability.
For example, two-liners (subtitles with two lines) should ideally be bottom-heavy: the first line should be slightly shorter than the second, but not too short,
as that could make the viewer read the second line before the first one.
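The bottom-heavy principle can be sketched in code. This is a deliberately simplified invention for illustration: it only balances line lengths, whereas real subtitling tools also weigh punctuation and parts of speech when choosing a break point.

```python
def split_bottom_heavy(text: str, max_len: int = 42) -> list[str]:
    """Split text into up to two lines at a space, preferring a layout
    where the first line is no longer than the second (bottom-heavy)."""
    if len(text) <= max_len:
        return [text]
    best = None
    for i, ch in enumerate(text):
        if ch != " ":
            continue
        top, bottom = text[:i], text[i + 1:]
        if len(top) > max_len or len(bottom) > max_len:
            continue  # one of the lines would exceed the character limit
        if len(top) > len(bottom):
            continue  # not bottom-heavy: first line would be longer
        # Keep the most balanced bottom-heavy split found so far.
        if best is None or len(bottom) - len(top) < len(best[1]) - len(best[0]):
            best = (top, bottom)
    return list(best) if best else [text]

print(split_bottom_heavy("She said she would meet us outside the station at noon"))
# ['She said she would meet us', 'outside the station at noon']
```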
Lip-sync dubbing: Audiovisual translation technique in which each speaker’s voice is replaced with a corresponding foreign-language audio provided by a voice talent
who has been cast for that specific role. That audio matches the performance, pace and lip movement of the original.
The lip-sync dubbing approach is used in TV adverts and is the most widespread translation approach for films and TV series in some countries such as Spain, Italy and France.
Live subtitling: Creating subtitles for the deaf and hard of hearing in real time for live TV content, such as news bulletins or sports.
Nowadays, this is mostly done by a subtitler who listens to the broadcast and, as they do so, respeaks the dialogue to voice-recognition software trained to recognise their voice.
The software types out the text, which appears on screen with a lag of only a few seconds after the utterance.
Lower Third: The section of the video frame at the bottom third of the screen, where subtitles are placed by default.
Monolingual subtitles: See Intralingual subtitles.
Multimodality: Use of more than one semiotic mode – i.e. use of more than one channel for creating meaning, such as speech, visuals, music, and so on – in meaning-making,
communication, and representation. Films do not rely solely on images to create meaning, but also on a range of elements including speech, written language and music.
Understanding the interrelation of the semiotic modes in meaning-making is very important when localising media for a different audience; since meaning is culturally bound,
the audience in the target language might not be able to identify or interpret all the non-verbal elements in the way that was intended.
Off-screen voice-over: Production technique in which a non-diegetic (i.e. not part of the action) narrator’s speech is timed to animations, titles,
or specific actions on screen. This technique is commonly used in corporate multimedia, tutorials, documentaries and marketing videos.
The script can be translated and voiced in a different language in order to replace the original audio track.
On-screen titles: Text that appears in the video frame, excluding subtitles. For example: programme titles, location or time information, identifiers, etc.
Open: Said of subtitles that cannot be turned on and off in a video. See Closed.
OST: See On-screen titles.
Quality Assurance: Step during which the subtitled video is checked by a linguist other than the subtitler who created the subtitles.
Reading speed: Speed at which the subtitles are shown on screen, also referred to as ‘presentation rate’. It is usually measured in either characters per second (cps) or words per minute (wpm).
As the length of words in different languages differs, the cps measure is used more often in the audiovisual translation industry.
The recommended reading speed for subtitles depends on factors such as media type, target audience, complexity of the subject, etc.
Respeaking: In live subtitling, the process of repeating what is heard, as it is being heard, into voice recognition software trained to that specific respeaker’s voice and pronunciation.
This software uses the audio input from the respeaker to generate the subtitle text.
Scrolling subtitle: Subtitle in which the flow of text is never interrupted. It can be pre-prepared, in which case the lines scroll up one after the other, or live,
in which case words appear one by one and the line scrolls up when complete. Since this approach can accommodate a much higher word rate than block subtitles, it can be used for a verbatim rendering of speech.
SDH: See Subtitling for the Deaf and Hard of Hearing.
Shot changes: The moments when one shot ends and the next one starts. They usually coincide with a change of angle or scene.
Source files: Digital assets used to create videos, like source footage, editable titles, etc.
Spotting: See Time-coding.
SRT: See SubRip file.
SubRip file: Subtitle file format (with the .srt extension) commonly used for online streaming platforms like YouTube.
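As an illustration (the timings and text here are invented), a SubRip file is plain text: each subtitle consists of a sequence number, a start and end time separated by “ --> ” (with milliseconds after a comma), the subtitle text, and a blank line:

```
1
00:00:01,000 --> 00:00:03,500
I never said that.

2
00:00:04,000 --> 00:00:06,200
You must be mistaken.
```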
Subtitle: Written rendering of the dialogue or commentary in audiovisual media, displayed on the screen in sync with the audio.
They can either be a translation of a dialogue in a foreign language, or a written rendering of the dialogue in the same language.
In either case, they can have additional information on the soundtrack to make the video accessible to viewers who are deaf or hard of hearing.
They can also include on-screen titles or text in graphics like street signs or newspaper headlines.
Subtitle editor: Software used to create and edit subtitles to be synchronised with a video file.
A subtitle editor usually includes a video preview, a space to input and edit text, a means of determining start and end times of subtitles,
and control over text formatting and positioning.
Subtitling guidelines: Set of instructions that the subtitler has to follow for a given producer or broadcaster, such as the character limit per line, the maximum and minimum reading speed, etc.
Subtitling for the Deaf and Hard of Hearing: Subtitles that contain additional information to make them accessible to viewers who are deaf or hard of hearing.
In the US and Canada, SDH refers only to interlingual subtitles, while in most of the rest of the world SDH can be either intra or interlingual.
SDH combine all the audible and linguistic information necessary to make a video accessible.
Surtitling: Surtitles, also known as supertitles, SurCaps or OpTrans, are transcribed or translated lyrics/dialogue projected above a stage
on a supertitling machine or displayed on a screen, commonly used in opera or other musical performances and theatre plays. The surtitles consist of pre-prepared scrolling text
that an operator times in and out while watching the live performance.
Text-to-speech voice-over: UN-style voice-over or off-screen narration that uses text-to-speech software (which transforms a text input into synthesised speech)
to give a voice to a translated script instead of recording it with human voice talents. See UN-style voice-over and Off-screen voice-over.
Template: List of “master subtitles” with the in and out times already defined. This time-coded file is then used to create interlingual subtitles in as many languages as necessary.
The template can be prepared by a person other than the translator(s) in order to streamline the subtitling process.
Time code: Time codes are numbers generated by a video at specific intervals. They are either accurate down to the millisecond or assigned to the frames in the video (one number per frame),
and are written as hours:minutes:seconds:frames (e.g. 00:14:56:10). Subtitles are synced to the audio and the footage of a video by determining a start and end time code for each subtitle.
“Time code” (TC) can also refer to the burnt-in timing information on a video.
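A sketch of the arithmetic behind the hours:minutes:seconds:frames notation, assuming a 25 fps video so that each frame lasts 40 ms (a hypothetical helper, not any particular tool’s function):

```python
def timecode_to_ms(tc: str, fps: int = 25) -> int:
    """Convert an hours:minutes:seconds:frames time code to milliseconds."""
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return total_seconds * 1000 + round(frames * 1000 / fps)

print(timecode_to_ms("00:14:56:10"))  # 896400 (14 min 56 s, plus 10 frames = 400 ms)
```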

Time-coding: Creating a subtitle file or template. In a subtitle editor, a video and its transcription are broken up into subtitle-length segments
by inputting frame-accurate timing information for when each segment should appear on-screen and then disappear. Also known as “spotting” or “cueing”.
Time stamps: Timing markers added to the transcript of an audio or video file. Their purpose is to help find specific parts of the text in a long recording,
or, vice versa, to find a specific utterance represented in writing in the transcript. Time stamps can be added at regular intervals or when certain events happen in the audio or video file (e.g. a change of speaker).
Usually, time stamps contain only minutes and seconds, as no greater accuracy is needed for them to serve their purpose.
Transcription: Representation of spoken language in written form. It can have varying degrees of fidelity to the original utterances of the speaker:
word-for-word verbatim transcription, intelligent transcription without interjections, or summarised transcription. Discourse transcription is a type of verbatim transcription
that adds information on tone of voice, pauses, etc.
Translated subtitles: See Interlingual subtitles.
UN-style voice-over: Audiovisual translation technique in which actor voices are recorded over the original audio track of the video, which remains audible in the background.
This method is most often used in documentaries and news reports to translate the words of foreign-language interviewees. It can be single-voice, in which only one voice talent voices all the speakers
in the video, or multi-voice, in which a group of voice talents voices the speakers. In some countries, like Russia and Poland, it is commonly used instead of lip-sync dubbing
or subtitling to translate films and TV series. Also called “voice-over translation”.
Upper Third: Section of the video frame at the top of the screen. Subtitles are placed there when there is already an on-screen title in the lower third of the screen.
Verbatim subtitles: Subtitles that reproduce the dialogue word for word.
Voice-over: Can be synonymous with Off-screen voice-over or UN-style voice-over. In the video game industry, voice-over is used to refer to any recorded voice tracks added to the video game.
Voice-over translationSee UN-style voice-over.
Words per minute: Reading speed measure in subtitling; the number of words shown on screen in the space of a minute.
WPM: See Words per minute.

If you’d like to learn more translation terms, be sure to check out our localisation glossary!