How to Balance Cost and Quality in AI Translation Evaluation – slator.com

As large language models (LLMs) gain prominence as state-of-the-art evaluators, prompt-based evaluation methods like GEMBA-MQM have emerged as powerful tools for assessing translation quality. However, LLM-based evaluation is expensive and computationally demanding, consuming vast numbers of tokens and incurring significant API costs. Scaling evaluation to large datasets quickly becomes impractical, raising a key question: […]
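To make the cost concern concrete, here is a minimal back-of-envelope sketch of why per-segment API calls add up. All figures are illustrative assumptions for this example, not published pricing, GEMBA-MQM's actual prompt sizes, or code from the article:

```python
# Illustrative sketch only: estimating API cost for prompt-based MT evaluation,
# where each translated segment is scored with one LLM call. Every constant
# below is an assumed placeholder, not a real published rate or prompt length.

PROMPT_TOKENS_PER_SEGMENT = 700   # assumed: few-shot MQM-style prompt + source/target pair
OUTPUT_TOKENS_PER_SEGMENT = 150   # assumed: structured error-span annotations
PRICE_PER_1K_INPUT = 0.01         # assumed USD per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.03       # assumed USD per 1K output tokens

def evaluation_cost(num_segments: int) -> float:
    """Estimated USD cost of scoring num_segments translations, one API call each."""
    input_cost = num_segments * PROMPT_TOKENS_PER_SEGMENT / 1000 * PRICE_PER_1K_INPUT
    output_cost = num_segments * OUTPUT_TOKENS_PER_SEGMENT / 1000 * PRICE_PER_1K_OUTPUT
    return input_cost + output_cost

# Under these assumptions, scoring a 100k-segment test set for a single
# MT system already runs into four figures:
print(f"${evaluation_cost(100_000):,.2f}")  # -> $1,150.00
```

Since the cost grows linearly with segments, systems compared, and prompt length, evaluating many systems across large multilingual test sets multiplies quickly, which is the scaling problem the article highlights.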

Meta’s BOUQuET Benchmark Brings Linguistic Diversity to AI Translation Evaluation – slator.com

On February 6, 2025, Meta unveiled BOUQuET, a comprehensive dataset and benchmarking initiative aimed at improving multilingual machine translation (MT) evaluation. This development aligns with Meta’s ongoing efforts to source diverse AI translation data through collaborative partnerships. The researchers noted that existing datasets and benchmarks often fall short due to their English-centric focus, narrow range […]