Google, Unbabel Expand Key AI Translation Benchmark to 55 Languages with WMT24++ – slator.com

Researchers from Google and Unbabel have unveiled WMT24++, a major expansion of the WMT24 machine translation (MT) benchmark, extending its language coverage from 9 to 55 languages and dialects. The dataset now includes human-written reference translations and post-edits for 46 additional languages, as well as new post-edits of the references for 8 of the original […]
Google Expands Low-Resource AI Translation with SMOL Dataset – slator.com

On February 17, 2025, Google released SMOL (Set of Maximal Overall Leverage), a professionally translated dataset aimed at improving machine translation (MT) for 115 low-resource languages (LRLs). SMOL consists of two components: SMOLSENT, a collection of 863 English sentences translated into 81 languages, and SMOLDOC, a dataset of 584 English documents translated into […]