Large Language Models Struggle to Evaluate Long AI Translations, Amazon Finds – slator.com

A new study from Amazon has revealed a limitation in using large language models (LLMs) to evaluate AI translation quality: performance drops as input length increases. While LLMs are increasingly used for high-quality sentence-level AI translation evaluation, the study finds that these models become “less reliable when evaluating long-form translation outputs.” Amazon researchers Tobias Domhan […]
Lessons from AI Translation to Improve Multilingual LLM Evaluation – slator.com

As large language models (LLMs) continue to scale across languages, their evaluation frameworks are struggling to keep pace. Two recent studies — one from Alibaba and academic partners, the other from a collaboration between Cohere and Google — highlight critical challenges in multilingual LLM evaluation. “As large language models continue to advance in linguistic capabilities, […]
Alibaba and Meta Face Off in Simultaneous AI Translation – slator.com

Two recent research papers, one from a team at Alibaba and Xiamen University, the other from Meta AI, proposed different ways to improve simultaneous AI translation (SiMT): by adapting large language models (LLMs) for real-time use, and by extending translation systems with a lightweight streaming module. The Alibaba and Xiamen University researchers introduced […]
Should AI Translations Qualify for Copyright Protection? – slator.com

In March 2025, a U.S. Court of Appeals ruled that purely AI-generated works cannot receive copyright protection. AI translation is, of course, an example of an AI-generated work, and with intellectual property (IP) in translation long being a hot potato in the U.S. and beyond, translators are increasingly crying foul over the use […]
How Large Language Models Improve Document-Level AI Translation – slator.com

In an era when large language models (LLMs) are reshaping AI translation, two recent studies have emerged with strategies to tackle document-level machine translation (MT). One from the University of Zurich introduces a method that treats document-level translation as conversation. Instead of translating a whole document in one go or splitting it into isolated segments, […]
How France’s Inria Aims to Improve AI Translation for Low-Resource Languages – slator.com

Large language models (LLMs) have significantly improved AI translation for high-resource languages, but performance remains uneven for low-resource languages (LRLs). In a March 6, 2025 paper, researchers Armel Zebaze, Benoît Sagot, and Rachel Bawden from Inria, the French National Institute for Research in Digital Science and Technology, introduced Compositional Translation (CompTra), an LLM-based approach designed […]
Unbabel Tackles Metric Bias in AI Translation – slator.com

In a March 11, 2025 paper, Unbabel introduced MINTADJUST, a method for more accurate and reliable machine translation (MT) evaluation. MINTADJUST addresses metric interference (MINT), a phenomenon where using the same or related metrics for both model optimization and evaluation leads to over-optimistic performance estimates. The researchers identified two scenarios where MINT commonly occurs and […]
Alibaba Says Large Reasoning Models Are Redefining AI Translation – slator.com

In a March 14, 2025 paper, researchers from Alibaba’s MarcoPolo Team explored the translation capabilities of large reasoning models (LRMs) like OpenAI’s o1 and o3, DeepSeek’s R1, Anthropic’s Claude 3.7 Sonnet, or xAI’s Grok 3, positioning them as “the next evolution” in translation beyond neural machine translation (NMT) and large language models (LLMs). They explained […]
How to Balance Cost and Quality in AI Translation Evaluation – slator.com

As large language models (LLMs) gain prominence as state-of-the-art evaluators, prompt-based evaluation methods like GEMBA-MQM have emerged as powerful tools for assessing translation quality. However, LLM-based evaluation is expensive and computationally demanding, requiring vast amounts of tokens and incurring significant API call expenses. Scaling evaluation to large datasets quickly becomes impractical, raising a key question: […]
How Microsoft Wants to Address Gender Bias in AI Speech Translation – slator.com

Gender bias in speech translation (ST) systems has long been a concern for researchers and users alike. In a January 10, 2025 paper, researchers from Microsoft Speech and Language Group presented their approach to addressing speaker gender bias in large-scale ST systems. The researchers identified a persistent masculine bias in ST systems, even in cases […]