Workshop

Machine Translation for the Scientific Domain

By Stefania Amodeo, Sokratis Sofianopoulos

From June 24th to 27th, 2024, Sheffield, UK, hosted the 25th Annual Conference of the European Association for Machine Translation (EAMT). This event gathered leading researchers and companies in the field of Machine Translation to discuss the latest advancements, methodologies, and challenges. Representing SciLake were our partners from Athena Research Centre: Dimitris Roussis, Sokratis Sofianopoulos, and Stelios Piperidis.

The conference covered a wide array of topics, including linguistic resources, evaluation techniques, and multilingual technologies. The audience comprised researchers and industry professionals dedicated to pushing the boundaries of Machine Translation.

SciLake participated in a poster session highlighting innovative work on translation models. The presentation included a conference paper titled "Enhancing Scientific Discourse: Machine Translation for the Scientific Domain," which outlines the development of parallel and monolingual corpora for the scientific field, focusing on language pairs such as Spanish-English, French-English, and Portuguese-English. These corpora comprise a general scientific corpus and four specialised corpora targeting the project’s research areas: Cancer Research, Energy Research, Neuroscience, and Transportation Research. The paper details the corpus creation process and the fine-tuning strategies used, and concludes with a thorough evaluation of the results.

This work aims to bridge the language gap in scientific research by developing general-purpose neural machine translation (NMT) models that can accurately and fluently translate scientific text across various specialised domains, ensuring that critical research is accessible to a global audience.

The draft versions of the conference proceedings are available online at https://eamt2024.sheffield.ac.uk/programme/proceedings, with the final version expected to be published in the ACL Anthology.