Meet QueEn, a Large Language Model for Quechua-English Translation – slator.com
In a December 6, 2024 paper, researchers at the University of Georgia introduced QueEn, a large language model (LLM) tailored for Quechua-English translation.
Quechua, also known as Runa Simi (the “language of the people”), is one of the most significant indigenous languages of the Americas. Once the administrative language of the Inca Empire, it remains widely spoken in Peru, Bolivia, Ecuador, Colombia, Argentina, and Chile. However, Quechua is listed as “vulnerable” by UNESCO due to a decline in intergenerational language transmission and the growing dominance of Spanish, particularly among younger speakers.
There have been efforts to preserve and revitalize Quechua, with national governments integrating it into school curricula and granting it official language status. However, the researchers caution that these initiatives have been “inconsistent and often underfunded,” warning that “the long-term viability of Quechua remains uncertain without more robust and sustained revitalization measures.”
Translating Quechua remains a challenge for machine translation (MT) due to its agglutinative and polysynthetic structure, where single words often encode extensive grammatical and semantic information through multiple morphemes. This, combined with limited digital resources, makes it difficult for conventional MT models to process and generate accurate translations, according to the researchers.
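To make the agglutination challenge concrete, here is a toy sketch, not a real morphological analyzer. The word wasikunapi (wasi "house" + -kuna plural + -pi locative, "in the houses") is a standard textbook example; the suffix table below is illustrative only.

```python
# Toy illustration of Quechua's agglutinative morphology: one word stacks
# suffixes, each adding grammatical meaning. The suffix list is a small
# illustrative sample, not an exhaustive or authoritative grammar.
SUFFIXES = {
    "kuna": "plural",
    "pi": "locative ('in/at')",
    "manta": "ablative ('from')",
}

def segment(word: str) -> list[str]:
    """Greedily strip known suffixes from the end of a word."""
    parts = []
    while True:
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if word.endswith(suffix) and len(word) > len(suffix):
                parts.insert(0, suffix)
                word = word[: -len(suffix)]
                break
        else:
            break
    return [word] + parts

print(segment("wasikunapi"))  # ['wasi', 'kuna', 'pi']
```

A subword tokenizer that splits along such morpheme boundaries keeps grammatical units intact, which is why the researchers point to morphology as a core difficulty for conventional MT models.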
To address these challenges, QueEn incorporates retrieval-augmented generation (RAG) to dynamically pull external linguistic resources, such as Quechua-English dictionaries and grammar guides, during translation. It also employs low-rank adaptation (LoRA) to efficiently fine-tune pre-trained models for Quechua's linguistic characteristics. These techniques enabled QueEn to achieve a BLEU score of 17.6, a stark improvement over baseline models such as GPT-4o and LLaMA 405B, which scored just 1.5.
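The retrieval step of a RAG pipeline can be sketched as follows. This is a minimal illustration with a toy dictionary and a made-up example sentence; QueEn's actual resources, retriever, and prompt format are not described in detail here.

```python
# Minimal sketch of the retrieval step in retrieval-augmented generation
# (RAG): look up relevant dictionary entries, then prepend them to the
# translation prompt sent to the LLM. The dictionary and the example
# sentence are toy assumptions, not QueEn's actual data.
DICTIONARY = {
    "wasi": "house",
    "allqu": "dog",
    "mikhuy": "to eat; food",
}

def retrieve(sentence: str) -> dict[str, str]:
    """Return dictionary entries whose headword appears in the sentence."""
    words = sentence.lower().split()
    return {qu: en for qu, en in DICTIONARY.items()
            if any(qu in w for w in words)}

def build_prompt(sentence: str) -> str:
    """Assemble a translation prompt with the retrieved glossary on top."""
    entries = retrieve(sentence)
    glossary = "\n".join(f"- {qu}: {en}" for qu, en in entries.items())
    return (f"Glossary:\n{glossary}\n\n"
            f"Translate this Quechua sentence into English: {sentence}")

print(build_prompt("allqu wasipi kachkan"))
```

The value of the pattern for a low-resource language is that scarce curated resources (dictionaries, grammar guides) are injected at inference time, so the LLM does not have to have memorized them during pre-training.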
According to the researchers, QueEn’s potential extends beyond academic research, addressing critical communication gaps in areas like education, healthcare, and legal support for Quechua-speaking communities. It also contributes to cultural preservation by ensuring Quechua remains a vibrant and functional language in the digital age.
“Effective translation tools for indigenous languages like Quechua can help bridge socioeconomic gaps in multilingual societies while preserving endangered languages for future generations,” the researchers emphasized.
While QueEn’s results are promising, the researchers see opportunities for further improvement, such as integrating audio data and speech recognition technologies to capture Quechua’s strong oral tradition and implementing morpheme-aware tokenization strategies to better handle its complex morphology.
Community involvement is also important. Integrating feedback from native Quechua speakers and adopting a human-in-the-loop approach “would not only improve translation accuracy but also ensure that the technology serves the needs of the Quechua-speaking community,” they added.
Concluding, the researchers highlighted that their methodology could be adapted to other endangered languages, supporting global efforts to promote linguistic diversity. “The techniques developed for QueEn could be adapted for other low-resource languages, particularly those facing similar challenges of cultural and linguistic preservation,” they said.
Authors: Junhao Chen, Peng Shu, Yiwei Li, Huaqin Zhao, Hanqi Jiang, Yi Pan, Yifan Zhou, Zhengliang Liu, Lewis C Howe, and Tianming Liu