Should I Translate My Game Using Machine Translation or AI?
Localization | 15 Mar 2024
By Jennifer O’Donnell, with assistance and checks from Miguel González, Marina Ilari, Lucile Danilov, Alexis Biro, and Elia Riciputi; edited by Wesley O’Donnell
Machine and AI (artificial intelligence) translation is a hot topic right now, especially in the games industry. I get why. It can be hard to justify localization costs when machine and AI translations are apparently “so good.” You’re not a native speaker of other languages but translation is just changing the words from one language to another, right? Surely these tools can translate your game accurately at a fraction of the cost of a human translator?
I understand these tools are very tempting, but let me explain what exactly they are, what you’re getting, and what you’re losing by using them.
What is Machine Translation?
Machine translation is when a piece of software translates text from one language into another without human intervention. It works by analyzing patterns in large volumes of bilingual data, and generating translations based on the probability of certain words within phrases occurring together in different languages.
Let’s take a look at the word “Resume.” On its own, this word is highly likely to come up as “CV” (a resume) or “résumé” (a summary) in French with Google Translate. This is because the words a machine chooses in a given situation depend on how it’s been programmed and the data pool it’s pulling from.
For example, Google Translate bases its results on publicly accessible documents which are available in multiple languages (such as UN documents and transcripts) as well as previous machine translations that have been corrected by humans. European languages have the largest data pools, which results in decent translations of commonly used phrases. Results are poorer for languages with smaller data pools, such as Japanese, Korean, Chinese, and Arabic, and for texts not commonly used to build the data pool, such as creative works like novels and video games.
This is why Google Translate picked “CV” and “résumé” for the word “Resume” and not “reprendre” for “resume the game”.
How “smart” the machine translation is for certain languages and contexts depends on the data it feeds on and how it’s programmed.
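To make the “Resume” example concrete, here is a deliberately tiny sketch of choosing a translation by frequency alone. The data and counts are invented for illustration, and real systems such as Google Translate are far more sophisticated than a lookup table. The only point is that a word with no context tends to get whatever translation dominates the data pool.

```python
from collections import Counter

# Hypothetical aligned examples: (English word, French translation seen in the data).
# The counts are invented to mimic a data pool dominated by formal documents.
aligned_pairs = (
    [("resume", "CV")] * 6
    + [("resume", "résumé")] * 3
    + [("resume", "reprendre")] * 1
)

def most_likely_translation(word, pairs):
    """Return the translation seen most often for `word` and its relative frequency."""
    counts = Counter(target for source, target in pairs if source == word)
    if not counts:
        return None, 0.0
    best, best_count = counts.most_common(1)[0]
    return best, best_count / sum(counts.values())

choice, probability = most_likely_translation("resume", aligned_pairs)
print(choice, probability)  # CV 0.6
```

In this toy data pool, “reprendre” exists but is rare, so it never wins for the bare word.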
How is Generative AI Translation Different from Machine Translation?
Generative AI translation processes data on a similar basis to machine translation, only with much more complicated algorithms and a much larger pool of data (most of which is stolen private data and copyrighted works) in order to create the illusion of naturally spoken language.
AI translation, as it’s being used now, is still a statistical language model that predicts how likely a word is to be used given the other words around it.
The idea is that an “AI” can understand language better based on the context, but the “context” is still dependent on what type of texts the machine was trained on. And many of these works have their own biases and limitations, which are then reflected in the translation.
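The dependence on training text can be sketched with another toy model. This is not how any real AI product works internally. It is a hypothetical word-counting stand-in, and the training phrases are invented. It picks a translation for “resume” based on neighbouring words it has seen before, and falls back to the overall most common choice when the context is unfamiliar.

```python
from collections import Counter, defaultdict

# Hypothetical training data: (English phrase, French word used for "resume").
# Note that nothing here comes from a game.
training_data = [
    ("send your resume", "CV"),
    ("attach a resume", "CV"),
    ("update your resume", "CV"),
    ("resume the meeting", "reprendre"),
]

def build_context_table(examples, target_word="resume"):
    """Count which translation appeared next to each neighbouring word."""
    table = defaultdict(Counter)
    overall = Counter()
    for phrase, translation in examples:
        overall[translation] += 1
        for neighbour in phrase.lower().split():
            if neighbour != target_word:
                table[neighbour][translation] += 1
    return table, overall

def translate_in_context(phrase, table, overall, target_word="resume"):
    """Vote using neighbours seen in training; fall back to overall counts."""
    votes = Counter()
    for neighbour in phrase.lower().split():
        if neighbour != target_word and neighbour in table:
            votes.update(table[neighbour])
    if not votes:
        votes = overall
    return votes.most_common(1)[0][0]

table, overall = build_context_table(training_data)
print(translate_in_context("Resume the meeting", table, overall))  # reprendre
print(translate_in_context("Resume Game", table, overall))         # CV
```

The model gets “Resume the meeting” right because it has seen that context, but a menu button reading “Resume Game” falls back to “CV” because nothing like it was in the training text.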
Let’s take the Chinese, Japanese, or Korean languages, for example. Gender is not expressed in these languages in the same way that it is in European languages. As a result, an AI translating from these languages often confuses gender, sometimes even being inconsistent within the same paragraph. And translations into these languages directly translate “he” and “she”, which makes the target text read awkwardly to native speakers. AI even struggles with gender in languages like Spanish, Italian, and French, in which gender is intrinsically tied to grammar.
AI also can’t handle formal and informal tones in many languages, especially European languages, due to complexities in grammar, the nuances of formality, and regional language variations. Despite what many AI developers claim, AI still lacks an understanding of context and social norms, which leads to errors that a native speaker can spot a mile away.
Furthermore, AI can’t produce creative, inspired writing. If your game has engaging, unique text, carefully crafted to convey atmosphere and immersion, rest assured that AI will never be able to work around language barriers to replicate your unique vision. It might be able to re-create the words, but a lot of the subtext and character will be lost, as a machine is not a native speaker of the language. A machine cannot know or understand the cultural contexts it is translating between, giving you, at best, boring and bland text or, at worst, complete nonsense. This is not what your paying customers deserve, especially if you want your game to resonate with audiences worldwide.
AI might be slightly more advanced than basic machine translation, but it’s still nowhere near the level of an experienced translator. The supposed benefits of using machine and AI translation are that they save time and cut costs. Great for any business looking to save money, but at what cost to the quality of the work?
What You’re Getting with Machine and AI Translation
Machine and AI translation are passable when used to translate texts which use a lot of the same patterns; legal and technical texts come to mind. When they’re trained on the specific types of documents a company uses frequently, they might produce something readable. (Although I have seen a number of legal and technical translators complain about poor machine translations.)
When any old machine translation is used, however, you can still get glaring errors, such as the wrong terminology or inappropriate wording, that would be noticeable to an experienced translator, but perhaps not to someone with only a passing understanding of the target language.
Some game companies have started using machine and AI translation for non-game text, specifically low-visibility, non-creative content such as player support, surveys, and terms and conditions. But even these need a professional to review the text, plus constant quality control to make sure the output stays up to standard.
These issues intensify within creative game localization. Game text consists of a mixture of technical and creative writing that is specific to each game’s systems, setting, and characters. This text is also beholden to further limitations, such as character and line restrictions in the UI. No two games, in design or text, are the same.
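As a small illustration of the UI restrictions mentioned above, here is a sketch of a check that a translated string fits a text box. The function name, the limits, and the sample strings are all hypothetical. It also assumes a fixed-width font and a language that separates words with spaces, which does not hold for Japanese or Chinese, so treat it as a starting point rather than a real LQA tool.

```python
import textwrap

def check_ui_fit(text, max_chars_per_line, max_lines):
    """Wrap `text` at word boundaries and report whether it fits the text box."""
    # break_long_words=False keeps an overlong word whole so it can be flagged
    # instead of being silently split in the middle.
    lines = textwrap.wrap(text, width=max_chars_per_line, break_long_words=False)
    too_long = any(len(line) > max_chars_per_line for line in lines)
    return {"lines": lines, "fits": len(lines) <= max_lines and not too_long}

# Hypothetical button that allows one line of ten characters.
print(check_ui_fit("Resume", 10, 1)["fits"])            # True
print(check_ui_fit("Spiel fortsetzen", 10, 1)["fits"])  # False
```

A translator who knows the limit can look for a shorter phrasing that still reads naturally. A tool that is not told about the limit has no reason to.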
[Image: Machine learning tool used to translate Mother 3 for Game Boy Advance.]
[Image: The Japanese translation for a number keypad from Longvinter on Steam. Original image from Twitter.]
But surely you can train machines to create accurate technical translations based on your game’s systems, and create accurate but creative dialogue and flavor text that’s appropriate for your game’s setting?
Let me ask you this: how would you train a machine to translate game text? With other people’s game localizations? Can you trust that other games’ writing and translations match your own game’s style? Would you be okay with letting your game be used to train a machine so others can use it for their game? (If the answer to the last one is no, make sure you confirm with a translation agency that they don’t use your text to train their software.)
As of now, there are no machine or AI programs which have been trained on video game localization. It would be incredibly difficult to do, because game text (including its translations) is copyrighted. (Although copyright doesn’t stop some AI programs, finished, released video game text is at least difficult to take directly from games.) As such, many video game translation agencies use common machine translation programs like DeepL and ChatGPT, which are trained on bilingual data scraped from the internet.
This is why machine and AI translation for video games, and any creative translation in the entertainment industry, falls apart. The technology as it is now is not designed with video games in mind, no matter what certain translation agencies say. The quality offered by those who claim their technology can handle video games is highly questionable.
Why Can’t You Hire a Translator to Edit Machine and AI Translation?
Some translation agencies claim they can mitigate the low quality of machine translation by using native translators to post-edit the automated output text (referred to as MTPE, machine translation post-editing). But the quality of writing produced by post-editing is also variable and highly questionable.
MTPE isn’t just editing the text to make it sound good. In order to achieve high quality translation from MTPE, a translator must compare the source text with the machine produced translation and check it for accuracy in meaning, style and characterization, as well as terminology consistency.
However, MTPE pays incredibly low rates, usually 40%-70% of standard translation rates, while offering only a negligible time gain. This means a translator can either work just as hard as usual to create a high-quality translation for as little as 40% of their normal rate, or they can put in only as much work as they’re getting paid for and give the text a cursory glance to make sure the machine translation is at least grammatically correct, without checking whether it’s accurate in meaning, style, or consistency.
This is why the vast majority of talented creative translators would rather translate from scratch than do post-editing work.
At the end of the day, post-editing creative works that have been machine translated takes just as much time as translating from scratch, and produces a lower quality translation.
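The rate problem can be put into rough numbers. The 40%-70% range comes from the figures above. The assumption that post-editing takes 90% of the time of translating from scratch is mine, purely for illustration of a “negligible time gain.”

```python
def relative_hourly_income(rate_fraction, time_fraction):
    """Hourly income from post-editing relative to translating from scratch.

    rate_fraction: MTPE pay as a fraction of the standard rate (e.g. 0.40).
    time_fraction: MTPE time as a fraction of from-scratch time (assumed).
    """
    if time_fraction <= 0:
        raise ValueError("time_fraction must be positive")
    return rate_fraction / time_fraction

# Assumed time gain: post-editing takes 90% as long as translating from scratch.
for rate_fraction in (0.40, 0.70):
    ratio = relative_hourly_income(rate_fraction, time_fraction=0.90)
    print(f"{rate_fraction:.0%} of the rate -> {ratio:.0%} of normal hourly income")
# 40% of the rate -> 44% of normal hourly income
# 70% of the rate -> 78% of normal hourly income
```

Under these assumptions, a post-editor only breaks even if the work takes 40%-70% of the usual time, which is not what happens with creative text.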
What You’re Losing by Using Machine and AI Translation
Machine and AI translation is not good enough to create good game text, and hiring linguists to perform post-editing runs a high risk of getting poor quality work. You’re not just losing quality; you’re losing your audience’s trust.
Machine and AI translation is as much of a hot topic for consumers as it is for businesses. The actors’ strike and scandals about stolen artwork, novels, and other creative works have consumers questioning whether they should spend money on something that uses a machine to create it. Should someone bother to read what a human didn’t bother to write?
When audiences notice a piece of media has used machine or AI translation, the backlash has often been severe, sometimes enough to force companies to fix translations post-release in patches, or even re-translate the work from scratch.
And these examples are just from the English-speaking sphere. How would you know if audiences in South America or Asia are upset and boycotting your work because of shoddy machine translation if you don’t speak those languages?
How Can Developers Know if They Have a Good Translation?
Game developers trust that the translation agency they hire is providing them with the best quality work, but you’re never guaranteed to get a good translation, especially when machine translation is being used.
Whether you hire community translators, professional translators, or machine translation, it always pays to put some quality checks in your localization process.
You can do this by:
- Asking for translation agency recommendations from fellow developers who have received positive localization feedback, or from reputable associations such as the IGDA Localization SIG.
- Telling the translation agency straight away that you want human translators, and not machine translation. (Ask for the translators’ names for transparency and request they be included in the credits.)
- Checking that the agency will not use your text to train their machine or AI translation software (even if you do not request machine translation).
- Giving detailed instructions to the translation agency on the type of translation you want, along with style guides and character information. (See High Quality Localization? Help Loc Help You! for more advice on this.)
- Confirming what quality checks the translation agency has in place.
- Hiring another translation agency or professional translators to review a section of the translation. (Make the criteria and style of translation you are expecting clear. Share the same instructions you gave the translation agency.)
- Having an extensive LQA period with plenty of time to fix errors and not just check that the text is implemented correctly. (See How to Get the Most from LQA for more information.)
Should I Use Machine or AI Translation for my Game?
Being a translator myself, my opinion on this topic is obviously biased. However, my colleagues and I have translated countless creative works and have seen first-hand the quality of the work produced by machines. As of today, machine and AI translation is not suitable for creative works.
They might seem like good cost- and time-saving measures (which is certainly how large translation agencies are selling them), but the money and time you save on the translation will be spent elsewhere: on a long LQA period to fix all the issues; on some poor editor or LQA tester doing underpaid or unpaid labor to double check and fix broken translations; on the fallout from releasing a poorly localized game that international fans will critique; or on re-translating everything down the line.
Whether your game is a story-rich adventure full of character, or a fun puzzle game with minimal text, I highly recommend you use professional human translators who are specialized in video games. Otherwise, you will likely get bland, regenerated text that is very likely to be incorrect.
You can ask fellow game developers who have had positive experiences and feedback on their localizations, or consult with reputable associations such as the IGDA Localization SIG for recommendations based on what you’re looking for and your budget.
If you decide to go with a large agency that promises high quality translation with a fast turnaround at a low rate, take it with a grain of salt. If they try to sell you on machine or AI translation as a way to further save money, take it with a whole block. Either way, it’s always best to explore different options and think about what would be best for your game.
Further Reading
As translation technology improves, game localizations are getting worse
As translation technology improves, game localizations are getting worse [Reaction on Reddit]
When a Global Journey Goes South: 10 Examples of Bad Translation
Emulator Can Now Use Machine Learning To Translate Games (Poorly)
Fans Slam Front Mission 2 Remake Over “Unacceptable” Translation
Front mission 2 remaster English translation is really rough [Reaction on Reddit]
Tabletop Simulator Used Google Translate for 29 Languages, Which… Yikes
The German Xbox UI is a mess (translation-wise) [Reaction on Resetera]