Call for Papers Special Issue

Machine Translation Journal
Special Issue on Machine Translation for Low-Resource Languages

GUEST EDITORS (Listed alphabetically)
• Alina Karakanta (FBK-Fondazione Bruno Kessler)
• Audrey N. Tong (NIST)
• Chao-Hong Liu (ADAPT Centre/Dublin City University)
• Ian Soboroff (NIST)
• Jonathan Washington (Swarthmore College)
• Oleg Aulov (NIST)
• Xiaobing Zhao (Minzu University of China)

Machine translation (MT) technologies have improved significantly over the last two decades, with developments in phrase-based statistical MT (SMT) and, more recently, neural MT (NMT). However, most of these methods rely on large parallel corpora for training, and such resources are not available for the majority of language pairs. As a result, current technologies often fall short when applied to low-resource languages, and developing MT technologies from relatively small corpora remains a major challenge for the MT community. In addition, many methods for developing MT systems still rely on several natural language processing (NLP) tools to pre-process texts in source languages and post-process MT outputs in target languages. The performance of these tools often has a great impact on the quality of the resulting translation. The availability of MT technologies and NLP tools can facilitate equal access to information for the speakers of a language and determine on which side of the digital divide they will end up. The lack of these technologies for many of the world's languages presents an opportunity both for the field to grow and for tools to be made available to speakers of low-resource languages.

In recent years, several workshops and evaluations have been organized to promote research on low-resource languages. NIST has conducted the Low Resource Human Language Technology (LoReHLT) evaluations annually from 2016 to 2019. In LoReHLT evaluations, no training data is provided in the evaluation language. Participants receive training data in related languages, but must bootstrap systems in the surprise evaluation language at the start of the evaluation. Methods for this include pivoting approaches and taking advantage of linguistic universals. The evaluations are supported by DARPA's Low Resource Languages for Emergent Incidents (LORELEI) program, which seeks to advance technologies that are less dependent on large data resources and that can be quickly adapted to new languages, so that information in any language can be extracted in a timely manner to provide situational awareness of emergent incidents. In addition, the Workshop on Technologies for MT of Low-Resource Languages (LoResMT) and the Workshop on Deep Learning Approaches for Low-Resource Natural Language Processing (DeepLo) provide venues for sharing research and development work in this field.

This special issue solicits original research papers on MT systems/methods and related NLP tools for low-resource languages in general. LoReHLT, LORELEI, LoResMT and DeepLo participants are very welcome to submit their work to the special issue. Summary papers on MT research for specific low-resource languages, as well as extended versions (>40% difference) of published papers from relevant conferences/workshops, are also welcome.

Topics of the special issue include but are not limited to:
 * Research and review papers on MT systems/methods for low-resource languages
 * Research and review papers on pre-processing and/or post-processing NLP tools for MT
 * Word tokenizers/de-tokenizers for low-resource languages
 * Word/morpheme segmenters for low-resource languages
 * Use of morphological analyzers and/or morpheme segmenters in MT
 * Multilingual/cross-lingual NLP tools for MT
 * Review of available corpora of low-resource languages for MT
 * Pivot MT for low-resource languages
 * Zero-shot MT for low-resource languages
 * Fast building of MT systems for low-resource languages
 * Re-usability of existing MT systems and/or NLP tools for low-resource languages
 * Machine translation for language preservation
 * Techniques that work across many languages and modalities
 * Techniques that are less dependent on large data resources
 * Use of language-universal resources
 * Bootstrapping trained resources for short development cycles
 * Entity-, relation- and event-extraction
 * Sentiment detection
 * Summarization
 * Processing diverse languages, genres (news, social media, etc.) and modalities (text, speech, video, etc.)

November 26, 2019: Expression of interest (EOI)
February 25, 2020: Paper submission deadline
July 7, 2020: Camera-ready papers due
December 2020: Publication

- For EOI, please submit via the link:
- For paper submission, please go to the MT journal website and select this special issue
- Authors should follow the "Instructions for Authors"
- The recommended paper length is 15 pages