Large language models (LLMs) remain a topic of deep interest across the language industry, headlining conferences and analyses almost daily.
But since 2023, a number of language service providers (LSPs) have taken the leap from theoretical to practical, and have incorporated LLMs into their translation workflows.
More specifically, LSPs have begun to use LLMs for machine translation (MT). The 2024 edition of the Association of Language Companies (ALC) Industry Survey, carried out by Slator, found that of 70 companies using MT in their workflows, nearly a third — 29% — said they used LLMs to produce MT.
This figure is a jump from 2023, when just 11% of companies offering MT said their output came from LLMs. What can account for the 18-percentage-point increase over the span of a year?
In a thought-provoking presentation at SlatorCon Silicon Valley, eBay’s Silvio Picinini touched on the advantages of LLM-powered language AI over traditional resources for translators, including CAT tools and MT.
LLMs can draw on vast amounts of data, including, within a given enterprise, the translations from all involved translators, reviewers, and subject matter experts. Language AI can also use that data to translate with full context, even incorporating images and other multimodal data.
MT models, on the other hand, can inadvertently divorce content from context, making the task at hand unnecessarily challenging for linguists.
Picinini noted that while AI can be expensive, the costs of LLMs are decreasing; he predicted that language AI will become faster and more capable over time.
Getting Better All the Time
AI scientists and engineers in academia and industry continue to make strides in improving LLMs, and at a fast clip.
To take just one recent example, LLMs often perform worse than standard MT models in specialized domains, such as medicine. But researchers have already had success using “instruction tuning” (i.e., fine-tuning models using datasets from various tasks formatted as instructions) to get models from Google, Meta, and Unbabel to “significantly” outperform baselines.
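For readers unfamiliar with the term, the sketch below shows roughly what "formatted as instructions" can mean for translation data: a parallel sentence pair is wrapped in a natural-language task description before being used for fine-tuning. The field names (`instruction`, `input`, `output`), the helper `to_instruction_example`, and the sample sentence pair are illustrative assumptions, not the format used by the researchers mentioned above.

```python
import json

def to_instruction_example(src_text, tgt_text, src_lang, tgt_lang, domain):
    """Wrap one parallel sentence pair as an instruction-style training record.

    The three field names are a common convention and an assumption here,
    not a schema taken from the research described in the article.
    """
    return {
        "instruction": (
            f"Translate the following {domain} text "
            f"from {src_lang} to {tgt_lang}."
        ),
        "input": src_text,
        "output": tgt_text,
    }

# Hypothetical medical-domain sentence pair, invented for illustration.
pairs = [
    ("The patient reported mild nausea.",
     "Der Patient berichtete über leichte Übelkeit."),
]

records = [
    to_instruction_example(src, tgt, "English", "German", "medical")
    for src, tgt in pairs
]

# One JSON object per line (JSONL) is a typical on-disk layout for such data.
jsonl = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
print(jsonl)
```

A fine-tuning run would consume many thousands of such records across several tasks; this only illustrates the shape of a single example.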
Other techniques rely on the existing infrastructure at LSPs to improve LLM performance. One team found that leveraging translation memories (TMs) can improve translation quality, reduce turnaround times, and offer cost savings, demonstrating an especially high return on investment for low-resource languages.
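One plausible way to leverage a TM, sketched below, is to retrieve the stored segments most similar to the new source sentence and place them in the prompt as examples for the LLM. This is a minimal illustration under stated assumptions, not the method of the team cited above: the function names, the 0.5 similarity threshold, the prompt wording, and the sample TM entries are all hypothetical, and similarity is computed with Python's standard `difflib` rather than a production fuzzy-match engine.

```python
from difflib import SequenceMatcher

def fuzzy_matches(source, tm, k=2, threshold=0.5):
    """Return up to k (score, src, tgt) TM entries most similar to `source`."""
    scored = [
        (SequenceMatcher(None, source.lower(), src.lower()).ratio(), src, tgt)
        for src, tgt in tm
    ]
    # Keep only entries above the (assumed) similarity threshold, best first.
    scored = [item for item in scored if item[0] >= threshold]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]

def build_prompt(source, tm, src_lang="English", tgt_lang="German"):
    """Assemble a few-shot translation prompt from the closest TM matches."""
    lines = [f"Translate from {src_lang} to {tgt_lang}."]
    for _, src, tgt in fuzzy_matches(source, tm):
        lines.append(f"{src_lang}: {src}")
        lines.append(f"{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)

# Hypothetical TM entries, invented for illustration.
TM = [
    ("Add the item to your cart.", "Legen Sie den Artikel in Ihren Warenkorb."),
    ("Your order has shipped.", "Ihre Bestellung wurde versandt."),
    ("Contact the seller for details.", "Kontaktieren Sie den Verkäufer für Details."),
]

prompt = build_prompt("Add this item to your cart.", TM)
print(prompt)
```

The resulting prompt string would then be sent to whichever LLM the workflow uses; that call is left out because it depends on the provider.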
And, in an art-imitating-life experiment, researchers at Google explored improving LLM translation quality by mimicking human translation workflows, paying particular attention to “sub-tasks that navigate a bilingual landscape.” The process outperformed zero-shot translation.
As the lines between human and automated translation continue to cross, if not blur, it is also worth noting that 16% of ALC Survey respondents said they are building or have built their own MT models in-house, up from just 3% in 2023.
Time will tell whether these companies ultimately stick with MT models to the exclusion of LLMs, or somehow combine methods for better results.