Google Expands Low-Resource AI Translation with SMOL Dataset – slator.com

On February 17, 2025, Google released SMOL (Set of Maximal Overall Leverage), a professionally translated dataset aimed at improving machine translation (MT) for 115 low-resource languages (LRLs). SMOL consists of two components: SMOLSENT, a collection of 863 English sentences translated into 81 languages, and SMOLDOC, a dataset of 584 English documents translated into […]

Meta’s BOUQuET Benchmark Brings Linguistic Diversity to AI Translation Evaluation – slator.com

On February 6, 2025, Meta unveiled BOUQuET, a comprehensive dataset and benchmarking initiative aimed at improving multilingual machine translation (MT) evaluation. This development aligns with Meta’s ongoing efforts to source diverse AI translation data through collaborative partnerships. The researchers noted that existing datasets and benchmarks often fall short due to their English-centric focus, narrow range […]