Last week Google Translate unveiled a new language option to supplement its other recent additions, which have included various Central and South East Asian and African languages. The difference is that this latest language comes from a galaxy far, far away. In line with the hype surrounding the new Star Wars film, Google have decided to offer translation into and out of Aurebesh.
Aurebesh is the written form of Galactic Basic, the most widely spoken language in the Star Wars films. Its alphabet corresponds to the Latin alphabet, featuring the standard 26 letters along with some digraphs, numbers and punctuation.
We’ve used Google to translate some well-known quotes from the films; take a look at how they turned out:
By studying these examples just a little, we can see that Aurebesh essentially reads like a coded version of any language that uses the Latin alphabet: it encodes its source letter by letter rather than translating it into an entirely different target language. This is because Galactic Basic, its spoken form, is depicted in the Star Wars films as the language of the audience, be it English, Spanish or any other. The advantage here is that it is really easy to decipher!
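To make the "coded version" idea concrete, here is a minimal Python sketch of that kind of letter-for-letter substitution. Since Aurebesh glyphs need a special font, it maps each Latin letter to the name of its Aurebesh character instead. The letter and digraph names are the ones commonly listed in Star Wars reference material and are an assumption here, as is the function name `to_aurebesh_names`. This is not how Google's feature is implemented, and nothing above says whether Google's version uses the digraphs at all.

```python
# Assumed Aurebesh character names for the 26 Latin letters.
LETTERS = {
    "a": "Aurek", "b": "Besh", "c": "Cresh", "d": "Dorn", "e": "Esk",
    "f": "Forn", "g": "Grek", "h": "Herf", "i": "Isk", "j": "Jenth",
    "k": "Krill", "l": "Leth", "m": "Mern", "n": "Nern", "o": "Osk",
    "p": "Peth", "q": "Qek", "r": "Resh", "s": "Senth", "t": "Trill",
    "u": "Usk", "v": "Vev", "w": "Wesk", "x": "Xesh", "y": "Yirt",
    "z": "Zerek",
}

# Assumed names for the digraph characters (two Latin letters, one glyph).
DIGRAPHS = {
    "ch": "Cherek", "ae": "Enth", "eo": "Onith", "kh": "Krenth",
    "ng": "Nen", "oo": "Orenth", "sh": "Shen", "th": "Thesh",
}


def to_aurebesh_names(text):
    """Return the list of Aurebesh character names spelling out `text`.

    Digraphs are matched before single letters; anything that is not a
    Latin letter (spaces, punctuation, digits) is passed through unchanged.
    """
    text = text.lower()
    names = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            names.append(DIGRAPHS[pair])
            i += 2
        elif text[i] in LETTERS:
            names.append(LETTERS[text[i]])
            i += 1
        else:
            names.append(text[i])
            i += 1
    return names


print(to_aurebesh_names("the force"))
```

Because every character maps back to exactly one letter or letter pair, reversing the lookup tables is all it takes to decode a message, which is why the examples are so easy to crack.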
We’d love to see Google take things further and add more complex fictional languages to the mix, but as of now it is unclear how this would actually work. Google Translate typically runs as a statistical machine translation system, meaning it builds its translations from statistical data gathered from a body of texts (a corpus) for each language. For a fictional language, there may simply not be enough written material available to create a reliable corpus, so it may not be possible to input any text and get an accurate response; a more limited Translate feature might be more appropriate. In spite of this, we’d love to see them try; perhaps Elvish, or Na’vi from Avatar, could be next? Until then, we here at will certainly be having fun with some Star Wars code cracking…
2 December 2015 09:47