How to Balance Cost and Quality in AI Translation Evaluation

As large language models (LLMs) gain prominence as state-of-the-art evaluators, prompt-based evaluation methods like GEMBA-MQM have emerged as powerful tools for assessing translation quality. However, LLM-based evaluation is expensive and computationally demanding, consuming large numbers of tokens and incurring significant API costs. Scaling evaluation to large datasets quickly becomes impractical, raising a key question: […]
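To see why costs scale so quickly, consider a rough back-of-the-envelope model: each segment scored with a GEMBA-MQM-style prompt consumes a few-shot evaluation prompt plus an annotated response, both billed per token. The prompt and response sizes and the per-token prices in the sketch below are illustrative assumptions, not published GEMBA-MQM figures.

```python
# Back-of-the-envelope cost model for prompt-based MQM evaluation.
# All numbers here are illustrative assumptions, not published figures.

def evaluation_cost(
    num_segments: int,
    prompt_tokens_per_segment: int = 800,   # assumed few-shot MQM prompt size
    output_tokens_per_segment: int = 150,   # assumed error-annotation response
    usd_per_1k_input: float = 0.01,         # placeholder API pricing
    usd_per_1k_output: float = 0.03,
) -> float:
    """Return the estimated USD cost of scoring num_segments translations."""
    input_cost = num_segments * prompt_tokens_per_segment / 1000 * usd_per_1k_input
    output_cost = num_segments * output_tokens_per_segment / 1000 * usd_per_1k_output
    return input_cost + output_cost

# Even at these modest assumed rates, a single pass over a one-million-segment
# test set costs about $12,500 -- before reruns or multi-system comparisons.
print(f"${evaluation_cost(1_000_000):,.0f}")
```

At these assumed rates, one evaluation pass over a million segments already runs to roughly $12,500, and the bill multiplies with every system compared or rerun performed, which is the cost-quality tension the article examines.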

Which Parts of a Prompt Should Be Translated to Improve Large Language Models?

On February 13, 2025, researchers Itai Mondshine, Tzuf Paz-Argaman, and Reut Tsarfaty from Bar-Ilan University suggested that translating only specific components of a prompt can improve the performance of multilingual large language models (LLMs) across various natural language processing (NLP) tasks. This research builds on prior work by Google, Alibaba, the Pune Institute of Computer […]
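As a rough illustration of the idea, the sketch below translates only the instruction component of a prompt into English while leaving the task input in its source language. The split into an instruction and a task input, and the translate and call_llm helpers, are hypothetical stand-ins for a real MT system and LLM API, not the exact recipe from the Bar-Ilan paper.

```python
# Minimal sketch of selective prompt translation: render only the
# instruction in English and keep the task input in its source language.
# The component split and both helper functions are illustrative
# placeholders, not the method as published.

from dataclasses import dataclass

@dataclass
class Prompt:
    instruction: str  # task description, e.g. "Summarize the text below."
    task_input: str   # the content the model should operate on

def translate(text: str, target_lang: str = "en") -> str:
    """Placeholder for a real MT system or translation-capable LLM call."""
    return f"[{target_lang}] {text}"  # tagging stub so the sketch runs

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM completion API call."""
    return f"<model output for: {prompt[:40]}...>"

def run_with_selective_translation(p: Prompt) -> str:
    # Translate only the instruction; the task input stays untouched, so
    # the model receives an English task description without any
    # translation noise added to the content it must process.
    english_instruction = translate(p.instruction, target_lang="en")
    return call_llm(f"{english_instruction}\n\n{p.task_input}")

print(run_with_selective_translation(
    Prompt(instruction="Résume le texte ci-dessous.",
           task_input="Les grands modèles de langue ...")))
```

Which components actually benefit from translation (the instruction, the few-shot examples, or the input itself) is precisely what the researchers vary across tasks; the sketch fixes one such configuration purely for illustration.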