MBZUAI Secures $1mn from Google.org to Tackle AI’s Arabic Data Gap

Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) has received $1 million in funding from Google.org to develop high-performance AI systems tailored to the linguistic realities of the Middle East and North Africa (MENA), as concerns grow over the dominance of English-language data in global AI models.

The funding will support a research initiative led by Dr. Thamar Solorio, Vice Provost of Faculty Excellence and Advancement and Professor of Natural Language Processing at MBZUAI. The project aims to address what researchers describe as a widening “data divide” — the imbalance that leaves speakers of underrepresented languages with less accurate and less culturally aware AI systems.

While AI research continues to advance rapidly, much of the data used to train large language models remains concentrated in English and other Western languages. When deployed in the MENA region, these systems often struggle with dialect variation, cultural nuance and context-specific meaning.

Professor Solorio said the grant would allow her team to move beyond adapting Western-built models to building frameworks grounded in regional realities.

“This funding allows us to take our research from an early exploratory phase to a level that can not only redefine the field, but lead to impact in people’s lives,” she said. “This support is vital because it allows us to move beyond adaptation of high-resource models to linguistically grounded AI for MENA languages, which highlights a much-needed paradigm shift in the field.”

Rethinking the Model-Building Playbook

Arabic presents a particular challenge for AI systems. Modern Standard Arabic coexists with multiple spoken dialects across the Gulf, Levant and North Africa. A single sentiment can be expressed in numerous ways depending on geography, culture and social context.

Most global large language models are trained on data that heavily underrepresents these variations. As a result, AI systems may produce grammatically correct output that lacks local nuance — or misinterpret dialect-heavy input altogether.

Rather than scaling model size, MBZUAI’s initiative focuses on building what researchers describe as “resource-lean” AI. The aim is to develop training frameworks that need less manually annotated data and less computing power, reducing reliance on vast datasets and expensive infrastructure.

This approach carries practical implications. As leading AI models become larger and more costly to train and operate, only a handful of global companies have the resources to compete. By contrast, more efficient systems could enable universities, startups and public institutions in the region to build locally relevant tools without heavy capital requirements.

Google’s Regional AI Strategy

The funding aligns with Google.org’s broader push to support AI research in emerging markets and expand access to language technologies beyond English-dominant ecosystems.

Yossi Matias, Vice President at Google and Head of Google Research, said the collaboration reflects a shared objective to advance inclusive AI development.

“We are happy to collaborate with MBZUAI, which is deeply rooted in advancing AI research and fostering regional academic talent,” he said. “By focusing on low-resource languages in Large Language Models, we are progressing on the MENA AI Opportunity Initiative’s commitment to providing access to the most innovative AI technology in Arabic, its dialects and other languages spoken in the region. Funding this research aligns with our goal to accelerate scientific discovery through cooperation that delivers real-world impact.”

For Google, the investment is modest in financial terms but strategically aligned. As global AI competition intensifies, partnerships with regional institutions offer a pathway to deeper local integration and influence in emerging markets.

Building Talent Alongside Technology

Beyond technical research, the funding will support postdoctoral and early-career researchers at MBZUAI. The university, founded in 2019 in Abu Dhabi, has positioned itself as a regional hub for advanced AI research and graduate education.

The initiative aims to cultivate expertise in natural language processing tailored specifically to MENA’s linguistic diversity — an area historically underrepresented in global AI scholarship.

The anticipated applications extend beyond academic publishing. More inclusive AI systems could improve digital communication tools, educational platforms, speech technologies and cultural preservation efforts across the region.

If successful, the research would not simply improve translation accuracy. It would change whom AI systems work well for.

In a field where model size and computing power often dominate headlines, MBZUAI’s project shifts attention to a more structural question: whose language, whose dialect and whose lived experience are encoded into the systems shaping digital life.

The $1 million grant will not close the global data gap on its own. But it signals growing recognition that AI’s future — particularly in regions like MENA — may depend less on scaling Western-built systems and more on building from the ground up.
