LowResNLP 2025 Proceedings Home | LowResNLP 2025 WEBSITE | RANLP WEBSITE

Proceedings of the First Workshop on Advancing NLP for Low-Resource Languages

Chairs
Ernesto Luis Estevanell-Valladares
Alicia Picazo-Izquierdo
Tharindu Ranasinghe
Besik Mikaberidze
Simon Ostermann
Daniil Gurgurov
Philipp Mueller
Claudia Borg
Marián Šimko



Front matter
Bridging the Gap: Leveraging Cherokee to Improve Language Identification for Endangered Iroquoian Languages
Liam Enzo Eggleston, Michael P. Cacioli, Jatin Sarabu, Ivory Yang and Kevin Zhu
pp. 1‑6
Building a Lightweight Classifier to Distinguish Closely Related Language Varieties with Limited Supervision: The Case of Catalan vs Valencian
Raúl García Cerdá, María Miró Maestre and Miquel Canal
pp. 7‑11
A thresholding method for Improving translation Quality for Indic MT task
Sudhansu Bala Das, Leo Raphael Rodrigues, Tapas Kumar Mishra and Bidyut Ku Patra
pp. 12‑20
A Multi-Task Learning Approach to Dialectal Arabic Identification and Translation to Modern Standard Arabic
Abdullah Khered, Youcef Benkhedda and Riza Batista-Navarro
pp. 21‑31
Low-Resource Machine Translation for Moroccan Arabic
Alexei Rosca, Abderrahmane Issam and Gerasimos Spanakis
pp. 32‑38
Efficient Architectures For Low-Resource Machine Translation
Edoardo Signoroni, Pavel Rychly and Ruggero Signoroni
pp. 39‑64
IfGPT: A Dataset in Bulgarian for Large Language Models
Svetla Peneva Koeva, Ivelina Stoyanova and Jordan Konstantinov Kralev
pp. 65‑75
Modular Training of Deep Neural Networks for Text Classification in Guarani
Jose Luis Vazquez, Carlos Ulises Valdez, Marvin Matías Agüero-Torales, Julio César Mello-Román, Jose Domingo Colbes and Sebastian Alberto Grillo
pp. 76‑81
Roman Urdu as a Low-Resource Language: Building the First IR Dataset and Baseline
Muhammad Umer Tariq Butt, Stalin Varanasi and Guenter Neumann
pp. 82‑87
The Brittle Compass: Navigating LLM Prompt Sensitivity in Slovak Migration Media Discourse
Jaroslav Kopčan, Samuel Harvan and Marek Suppa
pp. 88‑101
Explicit Edge Length Coding to Improve Long Sentence Parsing Performance
Khensa Daoudi, Mathieu Dehouck, Rayan Ziane and Natasha Romanova
pp. 102‑110
Evaluating LLM Capabilities in Low-Resource Contexts: A Case Study of Persian Linguistic and Cultural Tasks
Jasmin Heierli, Rebecca Bahar Ganjineh and Elena Gavagnin
pp. 111‑120
A Benchmark for Evaluating Logical Reasoning in Georgian For Large Language Models
Irakli Koberidze, Archil Elizbarashvili and Magda Tsintsadze
pp. 121‑130
Slur and Emoji Aware Models for Hate and Sentiment Detection in Roman Urdu Transgender Discourse
Muhammad Owais Raza, Aqsa Umar and Mehrub Awan
pp. 131‑139
Automatic Fact-checking in English and Telugu
Ravi Kiran Chikkala, Tatiana Anikina, Natalia Skachkova, Ivan Vykopal, Rodrigo Agerri and Josef van Genabith
pp. 140‑151
Synthetic Voice Data for Automatic Speech Recognition in African Languages
Brian DeRenzi, Anna Dixon, Mohamed Aymane Farhi and Christian Resch
pp. 152‑186
ADOR: Dataset for Arabic Dialects in Hotel Reviews: A Human Benchmark for Sentiment Analysis
Maram I. Alharbi, Saad Ezzini, Hansi Hettiarachchi, Tharindu Ranasinghe and Ruslan Mitkov
pp. 187‑191
Towards Creating a Bulgarian Readability Index
Dimitar Kazakov, Stefan Minkov, Ruslana Margova, Irina Temnikova and Ivo Emanuilov
pp. 192‑200

Last modified on October 27, 2025, 9:20 a.m.