Audiovisual Translation Glossary

This Audiovisual Translation Glossary provides simple, jargon-free definitions of the most widely used terms in audiovisual translation!

Welcome to our Audiovisual Translation Glossary! We hope it saves you the hassle of wading through overly convoluted explanations of industry terms: every definition here is simple and jargon-free. We hope you find it useful!



Audio description: Voice-over narration, placed during natural pauses in the audio of a video or theatre play and sometimes during dialogue if deemed necessary.
Its purpose is to provide information on key visual elements, actions, etc. in audiovisual media for the benefit of the blind and visually impaired audience.
Block subtitle: A subtitle that appears on screen for a set number of seconds before being replaced by the next one. Block subtitles are usually edited (i.e. speech is summarised),
timed to shot changes, and broken at logical points in the text.
Burnt-in subtitle: Subtitles that can’t be turned on and off in a video, as they are embedded in the video frames. Initially, subtitles were optically printed onto a film strip, hence the term “burnt”.
Caption: In the United States and Canada, “captions” are what is known in the United Kingdom and most other countries as “subtitles for the deaf and hard of hearing”.
See Subtitling for the Deaf and Hard of Hearing.
Character limit: The maximum number of characters allowed in a subtitle line, counting letters, spaces, punctuation and symbols. It varies depending on in-country accessibility standards, medium (e.g. streaming platform, broadcast TV),
language, etc. Character limits can vary significantly: for example, Netflix allows up to 42 characters per line for alphabetical languages, while the BBC recommends no more than 37.
Characters per second: Reading speed measure in subtitling; the number of characters shown on screen in the space of a second.
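For the technically minded, the calculation is simple division: character count over on-screen duration. A minimal sketch (the subtitle text and timings below are invented for illustration):

```python
def chars_per_second(text: str, start_ms: int, end_ms: int) -> float:
    """Reading speed of one subtitle in characters per second.

    Counts every character shown on screen, including spaces and
    punctuation; the line-break character itself is not counted.
    """
    char_count = len(text.replace("\n", ""))
    duration_s = (end_ms - start_ms) / 1000
    return char_count / duration_s

# A two-line subtitle displayed for 2.5 seconds (hypothetical example):
cps = chars_per_second("I never said that.\nYou must be mistaken.", 1000, 3500)
print(round(cps, 1))  # 15.6
```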
Closed: Said of subtitles that can be turned on and off in a video. See Open.
CPS: See Characters per second.
Cueing: See Time-coding.
Dialogue replacement: Production technique in which dialogue is re-recorded after the filming process.
In audiovisual translation, it consists of substituting the original audio with a translation of the dialogue that starts and ends where the original voice would.
However, it’s not done in a frame-accurate way, and lip movements aren’t synchronized. This makes it a cheaper approach than lip-sync dubbing, and it is generally used for e-Learning scenarios,
corporate presentations and informational videos.
Dubbing: See Lip-sync dubbing.
Edited subtitles: Subtitles that don’t render dialogue word-for-word, but instead summarise, paraphrase or omit speech to varying degrees due to space and reading speed constraints.
Forced Narrative: Text that clarifies speech or on-screen elements meant to be understood by the viewer of a video. Forced narrative subtitles are displayed even when subtitles are not turned on.
They are used in the following cases: to subtitle a foreign or fictional language that differs from the original language of the video but is meant to be understood,
to translate text in original-language graphics, and to transcribe dialogue in the same language as the audio for clarity when it is inaudible or distorted.
Frame: One of the many still images which compose a complete moving picture. “Frame” can also be used as a unit of time, but its actual duration depends on the frame rate at which it is displayed.
See Frames per second and Frame Rate.
Frames per second: The number of frames displayed in a second. If 25 frames are displayed within a second, the duration of each frame is 40 milliseconds.
See Frame and Frame Rate.
Frame rate: The frequency at which consecutive video images (frames) are displayed, in rapid succession, to create the visual effect of motion.
Expressed in frames per second, or fps. Common frame rates include 24 fps, 25 fps and 30 fps. See Frame and Frames per second.
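The arithmetic behind frame durations can be sketched in a couple of lines (a hypothetical illustration, not tied to any particular tool):

```python
def frame_duration_ms(fps: float) -> float:
    """Duration of a single frame in milliseconds at a given frame rate."""
    return 1000 / fps

# Common broadcast and cinema frame rates:
for fps in (24, 25, 30):
    print(fps, "fps ->", round(frame_duration_ms(fps), 2), "ms per frame")
```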
FPS: See Frames per second.
Graphics’ text: Text that is part of the graphics of a video file, and therefore is not editable unless the source files are available.
There are several ways of dealing with the translation of graphics’ text, from forced narratives (see Forced Narrative) to full replacement
by reproducing the formatting of the original in the target language. Of course, the latter is much more costly and time-consuming than the former.
Interlingual bilingual subtitles: Also called “dual” subtitles, they are subtitles in two different languages that are displayed simultaneously.
They are used in countries where the audience speaks different languages or as a language learning tool.
Interlingual subtitles: Subtitles in a language other than that of the video’s original audio.
Intralingual subtitles: Subtitles in the same language as the original audio of the video.
Line breaks: The points where a subtitle line ends. Since the length of the lines in a subtitle is constrained, a sentence often doesn’t fit in one line or one subtitle.
Subtitles and lines should be broken at logical points, ideally at a piece of punctuation like a full stop, comma or dash. If that’s not possible, avoid separating parts of speech that belong together, such as an article and its noun
(e.g. the + bus, a + taxi). Other factors to take into account when deciding the best layout for a subtitle are the visibility of the image behind the subtitles, the speakers’ position, and readability.
For example, two-liners (subtitles with two lines) should ideally be bottom-heavy: the first line should be slightly shorter than the second, but not too short,
as that could make the viewer read the second line before the first one.
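The bottom-heavy principle can be sketched in code. This is a deliberately simplified invention for illustration: it only balances line lengths, whereas real subtitling tools also weigh punctuation and parts of speech when choosing a break point.

```python
def split_bottom_heavy(text: str, max_len: int = 42) -> list[str]:
    """Split text into up to two lines at a space, preferring a layout
    where the first line is no longer than the second (bottom-heavy)."""
    if len(text) <= max_len:
        return [text]
    best = None
    for i, ch in enumerate(text):
        if ch != " ":
            continue
        top, bottom = text[:i], text[i + 1:]
        if len(top) > max_len or len(bottom) > max_len:
            continue  # one of the lines would exceed the character limit
        if len(top) > len(bottom):
            continue  # not bottom-heavy: first line would be longer
        # Keep the most balanced bottom-heavy split found so far.
        if best is None or len(bottom) - len(top) < len(best[1]) - len(best[0]):
            best = (top, bottom)
    return list(best) if best else [text]

print(split_bottom_heavy("She said she would meet us outside the station at noon"))
# ['She said she would meet us', 'outside the station at noon']
```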
Lip-sync dubbing: Audiovisual translation technique in which each speaker’s voice is replaced with a corresponding foreign-language audio provided by a voice talent
who has been cast for that specific role. That audio matches the performance, pace and lip movement of the original.
The lip-sync dubbing approach is used in TV adverts and is the most widespread translation approach for films and TV series in some countries such as Spain, Italy and France.
Live subtitling: Creating subtitles for the deaf and hard of hearing in real time for live TV content, such as news bulletins or sports.
Nowadays, this is mostly done by a subtitler who listens to the broadcast and, as they do so, respeaks the dialogue to voice-recognition software trained to recognise their voice.
The software types out the text, which appears on screen with a lag of only a few seconds after the utterance.
Lower Third: The section of the video frame at the bottom third of the screen, where subtitles are placed by default.
Monolingual subtitles: See Intralingual subtitles.
Multimodality: Use of more than one semiotic mode – i.e. use of more than one channel for creating meaning, such as speech, visuals, music, and so on – in meaning-making,
communication, and representation. Films do not rely solely on images to create meaning, but also on a range of elements including speech, written language and music.
Understanding the interrelation of the semiotic modes in meaning-making is very important when localising media for a different audience; since meaning is culturally bound,
the audience in the target language might not be able to identify or interpret all the non-verbal elements in the way that was intended.
Off-screen voice-over: Production technique in which a non-diegetic (i.e. not part of the action) narrator’s speech is timed to animations, titles,
or specific actions on screen. This technique is commonly used in corporate multimedia, tutorials, documentaries and marketing videos.
The script can be translated and voiced in a different language in order to replace the original audio track.
On-screen titles: Text that appears in the video frame, excluding subtitles. For example: programme titles, location or time information, identifiers, etc.
Open: Said of subtitles that cannot be turned on and off in a video. See Closed.
OST: See On-screen titles.
Quality Assurance: Step during which the subtitled video is checked by a linguist other than the subtitler who created the subtitles.
Reading speed: Speed at which the subtitles are shown on screen, also referred to as ‘presentation rate’. It is usually measured in either characters per second (cps) or words per minute (wpm).
As the length of words in different languages differs, the cps measure is used more often in the audiovisual translation industry.
The recommended reading speed for subtitles depends on factors such as media type, target audience, complexity of the subject, etc.
Respeaking: In live subtitling, the process of repeating what is heard, as it is being heard, into voice recognition software trained to that specific respeaker’s voice and pronunciation.
This software uses the audio input from the respeaker to generate the subtitle text.
Scrolling subtitle: Subtitle in which the flow of text is never interrupted. It can be pre-prepared, in which case the lines scroll up one after the other, or live,
in which case words appear one by one and the line scrolls up when complete. Since this approach can accommodate a much higher word rate than block subtitles, it can be used for a verbatim rendering of speech.
SDH: See Subtitling for the Deaf and Hard of Hearing.
Shot changes: The moments when one shot ends and the next one starts. They usually coincide with a change of angle or scene.
Source files: Digital assets used to create videos, like source footage, editable titles, etc.
Spotting: See Time-coding.
SRT: See SubRip file.
SubRip file: Subtitle file format (with the .srt extension) commonly used for online streaming platforms like YouTube.
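As an illustration (the timings and text here are invented), a SubRip file is plain text: each subtitle consists of a sequence number, a start and end time separated by “ --> ” (with milliseconds after a comma), the subtitle text, and a blank line:

```
1
00:00:01,000 --> 00:00:03,500
I never said that.

2
00:00:04,000 --> 00:00:06,200
You must be mistaken.
```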
Subtitle: Written rendering of the dialogue or commentary in audiovisual media, displayed on the screen in sync with the audio.
They can either be a translation of a dialogue in a foreign language, or a written rendering of the dialogue in the same language.
In either case, they can have additional information on the soundtrack to make the video accessible to viewers who are deaf or hard of hearing.
They can also include on-screen titles or text in graphics like street signs or newspaper headlines.
Subtitle editor: Software used to create and edit subtitles to be synchronised with a video file.
A subtitle editor usually includes a video preview, a space to input and edit text, a means of determining start and end times of subtitles,
and control over text formatting and positioning.
Subtitling guidelines: Set of instructions that the subtitler has to follow for a given producer or broadcaster, such as the character limit per line, the maximum and minimum reading speed, etc.
Subtitling for the Deaf and Hard of Hearing: Subtitles that contain additional information to make them accessible to viewers who are deaf or hard of hearing.
In the US and Canada, SDH refers only to interlingual subtitles, while in most of the rest of the world SDH can be either intra or interlingual.
SDH combine all the audible and linguistic information necessary to make a video accessible.
Surtitling: Surtitles, also known as supertitles, SurCaps or OpTrans, are transcribed or translated lyrics/dialogue projected above a stage
on a supertitling machine or displayed on a screen, commonly used in opera or other musical performances and theatre plays. The surtitles consist of pre-prepared scrolling text
that an operator times in and out while watching the live performance.
Text-to-speech voice-over: UN-style voice-over or off-screen narration that uses text-to-speech software (which transforms a text input into synthesised speech)
to give a voice to a translated script instead of recording it with human voice talents. See UN-style voice-over and Off-screen voice-over.
Template: List of “master subtitles” with the in and out times already defined. This time-coded file is then used to create interlingual subtitles in as many languages as necessary.
The template can be prepared by a person other than the translator(s) in order to streamline the subtitling process.
Time code: Time codes are numbers generated by a video at specific intervals. They are either accurate down to the millisecond or assigned to the frames in the video (one number per frame),
and are written as hours:minutes:seconds:frames (e.g. 00:14:56:10). Subtitles are synced to the audio and the footage of a video by determining a start and end time code for each subtitle.
“Time code” (TC) can also refer to the burnt-in timing information on a video.
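A sketch of the arithmetic behind the hours:minutes:seconds:frames notation, assuming a 25 fps video so that each frame lasts 40 ms (a hypothetical helper, not any particular tool’s function):

```python
def timecode_to_ms(tc: str, fps: int = 25) -> int:
    """Convert an hours:minutes:seconds:frames time code to milliseconds."""
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return total_seconds * 1000 + round(frames * 1000 / fps)

print(timecode_to_ms("00:14:56:10"))  # 896400 (14 min 56 s, plus 10 frames = 400 ms)
```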

Time-coding: Creating a subtitle file or template. In a subtitle editor, a video and its transcription are broken up into subtitle-length segments
by inputting frame-accurate timing information for when each segment should appear on-screen and then disappear. Also known as “spotting” or “cueing”.
Time stamps: Timing markers added to the transcript of an audio or video file. Their purpose is to help find specific parts of the text in a long recording,
or, vice versa, to find a specific utterance represented in writing in the transcript. Time stamps can be added at regular intervals or when certain events happen in the audio or video file (e.g. a change of speaker).
Usually, time stamps contain only minutes and seconds, as no greater accuracy is needed for them to serve their purpose.
Transcription: Representation of spoken language in written form. It can have varying degrees of fidelity to the original utterances of the speaker:
word-for-word verbatim transcription, intelligent transcription without interjections, or summarised transcription. Discourse transcription is a type of verbatim transcription
that adds information on tone of voice, pauses, etc.
Translated subtitles: See Interlingual subtitles.
UN-style voice-over: Audiovisual translation technique in which actor voices are recorded over the original audio track of the video, which remains audible in the background.
This method is most often used in documentaries and news reports to translate the words of foreign-language interviewees. It can be single-voice, in which only one voice talent voices all the speakers
in the video, or multi-voice, in which a group of voice talents voices the speakers. In some countries, like Russia and Poland, it is commonly used instead of lip-sync dubbing
or subtitling to translate films and TV series. Also called “voice-over translation”.
Upper Third: Section of the video frame at the top of the screen. Subtitles are placed there when there is already an on-screen title in the lower third of the screen.
Verbatim subtitles: Subtitles that reproduce the dialogue word for word.
Voice-over: Can be synonymous with Off-screen voice-over or UN-style voice-over. In the video game industry, voice-over is used to refer to any recorded voice tracks added to the video game.
Voice-over translationSee UN-style voice-over.
Words per minute: Reading speed measure in subtitling; the number of words shown on screen in the space of a minute.
WPM: See Words per minute.

If you’d like to learn more translation terms, be sure to check out our localisation glossary!