Google Expands Low-Resource AI Translation with SMOL Dataset – slator.com

On February 17, 2025, Google released SMOL (Set of Maximal Overall Leverage), a dataset translated by professional translators aimed at improving machine translation (MT) for 115 low-resource languages (LRLs). SMOL consists of two components: SMOLSENT, a collection of 863 English sentences translated into 81 languages, and SMOLDOC, a dataset of 584 English documents translated into […]

Cascaded Speech Translation Systems Outperform End-to-End Models, Research Finds – slator.com

The first-ever voluntary mentorship program in speech translation, SpeechT, launched and led by Yasmin Moslem, NLP Researcher, brought together researchers, practitioners, and students from diverse companies and institutions worldwide to explore speech translation. Running from December 2024 to January 2025, the initiative introduced participants to data collection, model training, and advanced research techniques, helping them […]

Xiaomi’s Training Recipe for Better Multilingual AI Translation – slator.com

In a February 7, 2025 paper, researchers from Chinese tech company Xiaomi benchmarked the capabilities of open-source large language models (LLMs) with under ten billion parameters for multilingual machine translation (MT) tasks. They proposed the “best data recipe” to enhance AI translation performance. The researchers explained that open-source LLMs have shown improvements in multilingual capabilities, […]

Is Machine Translation Post-Editing Tedious? – slator.com

The language industry will remember 2024 as bringing an interesting mix of rapid-fire innovations and developments, with some clear trends emerging on the technology side, including translation as a feature (TaaF), multimodal AI adoption, retrieval augmented generation (RAG) applications, and large language model (LLM) customization. The balance between human expertise and AI automation continued to […]

Research Pits Traditional Machine Translation Against LLM-Powered AI Translation – slator.com

As large language models (LLMs) continue to transform translation workflows, a new study underscores the ongoing importance of conventional, domain-specific machine translation (MT) models. While recognizing the impact of LLMs on translation processes, the researchers emphasize the need for careful evaluation of workflows to ensure optimal outcomes. Previous research has shown that MT systems often […]

The Most Popular Language Industry Stories of 2024 – slator.com

As 2024 comes to a close, it is time to reflect on the most popular stories, trends, innovations, and themes that made the Slator headlines throughout the year, highlighting key developments in the language industry. Here is a selection of stories that attracted the most attention and engagement from our readers around the world. Will […]

Sony Aims to Improve AI Translation for Indian Language Entertainment Content – slator.com

In a December 29, 2024 paper, researchers Pratik Rakesh Singh, Mohammadi Zaki, and Pankaj Wasnik from Sony Research India introduced a framework aimed at improving translations for entertainment content in Indian languages. They claim this is “the first of its kind,” using a blend of context awareness and style adaptation to produce translations that are […]

How Microsoft Wants to Address Gender Bias in AI Speech Translation – slator.com

Gender bias in speech translation (ST) systems has long been a concern for researchers and users alike. In a January 10, 2025 paper, researchers from Microsoft Speech and Language Group presented their approach to addressing speaker gender bias in large-scale ST systems. The researchers identified a persistent masculine bias in ST systems, even in cases […]

Alibaba Launches ‘Completely Revamped’ Translation Infrastructure – slator.com

Hangzhou, China-based e-commerce and tech conglomerate Alibaba has announced the release of a proprietary large language model (LLM) that is “better than products offered by Google, DeepL, and ChatGPT.” Shots fired. The proprietary LLM, known as “Marco MT,” is to be deployed across Alibaba’s existing translation offering launched last year to small […]