In a blog post, Google researchers describe how the Xtreme natural language processing benchmark can evaluate cross-lingual generalization using nine tasks that span 40 languages from a dozen language families. While models tested on English come close to human performance on many existing tasks, performance is substantially lower for many other languages, Google Research senior software engineer Melvin Johnson and DeepMind scientist Sebastian Ruder wrote in the post. "Overall, a large gap between performance in English and other languages remains across all models and settings, which indicates that there is much potential for research on cross-lingual transfer," they noted. The goal of Xtreme, then, is to encourage more research in AI-based multilingual learning, according to Google. You can read the preprint paper on Xtreme here, while the code and examples are available on GitHub.
This story first appeared in Inside AI.