Large language models (LLMs) have significantly improved AI translation for high-resource languages, but performance remains uneven for low-resource languages (LRLs).
In a March 6, 2025 paper, researchers Armel Zebaze, Benoît Sagot, and Rachel Bawden from Inria, the French National Institute for Research in Digital Science and Technology, introduced Compositional Translation (CompTra), an LLM-based approach designed to improve translation quality for LRLs.
CompTra prompts LLMs to break down sentences into simpler phrases, translate each separately using in-context examples, and then use these phrase-translation pairs to guide the translation of the original sentence.
The researchers explained that “LLMs are more effective at handling short phrases” and that simpler phrases make it easier to find relevant in-context examples. By breaking complex sentences into simpler phrases, CompTra makes the translation task more manageable for LLMs and puts a limited pool of in-context examples to more effective use.
Compositionality
CompTra first breaks the input sentence down into simpler, independent phrases, each of which captures part of the sentence’s meaning while using its words in their original context. For each decomposed phrase, the system retrieves relevant in-context examples (typically four) from a selection pool.
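To illustrate, here is a minimal sketch of this first stage in Python. The prompt wording, the generic `chat` callable, and the word-overlap retriever are illustrative assumptions, not the authors’ implementation; the paper uses its own decomposition prompts and similarity-based retrieval.

```python
# Sketch of CompTra's decomposition and retrieval stages (illustrative only).

DECOMPOSE_PROMPT = """Break the following sentence into short, independent \
phrases that keep each word in its original context. Return one phrase per line.

Sentence: {sentence}
Phrases:"""

def decompose(sentence: str, chat) -> list[str]:
    """Ask the LLM for simpler phrases; `chat` is any text-in/text-out callable."""
    reply = chat(DECOMPOSE_PROMPT.format(sentence=sentence))
    return [line.lstrip("- ").strip() for line in reply.splitlines() if line.strip()]

def retrieve_examples(phrase: str, pool: list[tuple[str, str]], k: int = 4):
    """Rank (source, target) pairs from the selection pool by word overlap
    with the phrase, a simple stand-in for the paper's similarity retrieval."""
    words = set(phrase.lower().split())
    ranked = sorted(pool, key=lambda pair: -len(words & set(pair[0].lower().split())))
    return ranked[:k]
```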
With these examples in place, the LLM translates each phrase individually in a few-shot setting, leveraging the retrieved examples as references. This process results in translated outputs for each of the decomposed phrases.
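In code, this phrase-level few-shot step could look like the following sketch, where the prompt template and the target-language label are assumptions:

```python
def translate_phrase(phrase: str, examples: list[tuple[str, str]],
                     tgt_lang: str, chat) -> str:
    """Translate one decomposed phrase, using the retrieved (source, target)
    pairs as few-shot demonstrations. The prompt format is illustrative."""
    demos = "\n".join(f"English: {src}\n{tgt_lang}: {tgt}" for src, tgt in examples)
    prompt = f"{demos}\nEnglish: {phrase}\n{tgt_lang}:"
    return chat(prompt).strip()
```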
To maintain accuracy, a language identification step is applied to verify that all translations align with the intended target language. Any outputs that fail this check are discarded.
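A filter of this kind can be sketched with an off-the-shelf language identifier; the `langid` package below is a stand-in for whatever identification model the authors actually used.

```python
import langid  # pip install langid; a stand-in language identifier

def keep_valid(pairs: list[tuple[str, str]], tgt_code: str) -> list[tuple[str, str]]:
    """Keep only (phrase, translation) pairs whose translation is detected as
    the target language (ISO 639-1 code, e.g. "sw" for Swahili)."""
    return [(p, t) for p, t in pairs if langid.classify(t)[0] == tgt_code]
```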
In the final stage, the LLM is prompted once again to translate the original sentence, this time with the validated phrase-translation pairs supplied as additional context. These pairs guide the model toward a final translation that integrates the phrase-level insights, yielding a more accurate and contextually appropriate result.
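Putting it together, the final prompt can be sketched as follows; again, the exact template is an assumption rather than the authors’ wording:

```python
def compositional_translate(sentence: str, pairs: list[tuple[str, str]],
                            tgt_lang: str, chat) -> str:
    """Translate the full sentence with the validated phrase-translation
    pairs supplied as guiding context."""
    hints = "\n".join(f"English: {p}\n{tgt_lang}: {t}" for p, t in pairs)
    prompt = (f"Use the following phrase translations as guidance.\n{hints}\n\n"
              f"Now translate the full sentence.\nEnglish: {sentence}\n{tgt_lang}:")
    return chat(prompt).strip()
```

Chaining the sketches gives the full pipeline: decompose the sentence, retrieve examples per phrase, translate each phrase, filter by language ID, and prompt once more for the final output.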
The researchers tested CompTra on translations from English into 15 different LRLs, using datasets including FLORES 200, NTREX 128, and TICO-19. They used several LLMs including LLaMA 3.1 (8B, 70B), Gemma 2 (9B, 27B), and Command-R+.
They found that CompTra consistently outperformed standard similarity-based few-shot translation, with gains measured using XCOMET and MetricX evaluation metrics. Gains were particularly evident when working with smaller selection pools or out-of-domain data.
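For readers who want to run this kind of scoring themselves, XCOMET is available through Unbabel’s `comet` package; the checkpoint name, the Hugging Face gating note, and the toy Swahili example below are assumptions about a typical setup, not the paper’s exact evaluation pipeline.

```python
# pip install unbabel-comet; the XCOMET-XL checkpoint is gated on Hugging Face.
from comet import download_model, load_from_checkpoint

model = load_from_checkpoint(download_model("Unbabel/XCOMET-XL"))
data = [{"src": "The committee approved the budget.",  # English source
         "mt":  "Kamati iliidhinisha bajeti.",         # hypothetical system output
         "ref": "Kamati iliidhinisha bajeti hiyo."}]   # hypothetical reference
print(model.predict(data, batch_size=8, gpus=0).system_score)  # gpus=0: CPU
```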
The researchers wrote that “applying compositionality to perform MT [machine translation] will hopefully inspire further work on reasoning-based approaches to MT.”
The code and outputs are publicly available, encouraging further experimentation and adoption.