
A man researches Artificial Intelligence at a workstation in Kampala on December 19, 2023. PHOTO/FRANK BAGUMA
You know how everyone says that bigger is always better? In Artificial Intelligence (AI), that idea has become the standard.
But here’s the catch— new research data is starting to show that just making models larger isn’t always making them better. In fact, it’s a costly game that drains resources without the performance boost to match.
While the debate rages globally, Uganda’s growing tech ecosystem is feeling the pinch.
This race for bigger AI models isn’t just about tech—it's about who benefits and who gets left behind.
Many research papers that this reporter looked at show that while the biggest players reap the rewards, crucial areas like health, education, and climate solutions are sidelined.
This is intriguing because if AI is controlled by only a few, they’ll shape the future in their image, not ours.
AI is in overdrive—bigger models, richer data, and supercharged machines. The mantra? "Bigger is better."
Over the past decade, this scale obsession has birthed everything from Google Translate's uncanny language skills to social media feeds that read your mind and adverts that know your shoe size.
It began with AlexNet in 2012—a breakthrough that used graphics processing units (GPUs) and tons of data to prove bigger is smarter. By 2019, Richard Sutton’s “bitter lesson” nailed it: more computing makes better AI.
But is bigger always better?
One paper this article focused on, ‘Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI’ by three AI researchers, Gael Varoquaux, Alexandra Sasha Luccioni and Meredith Whittaker, argues not necessarily, and contends that the single-minded focus on scaling up may be crowding out other avenues of innovation.
In Uganda, startups like Sunbird AI, MyMedikoz, and CodeBits are building impressive AI models with limited resources, showing that AI innovation doesn't require massive models.
But the "bigger-is-better" mindset has funneled billions of dollars into massive models, devouring computing power like a hungry hippo.
Generative AI now churns out lifelike images and stories, but it’s a game only the deep-pocketed can play. University researchers and startups? Left in the dust.
This isn’t just about research—it’s rewriting public policy and markets. Governments see big models as both smarter and riskier, fueling regulations like the US AI Executive Order and the EU’s AI Act. Uganda is mulling its own AI law, expected early 2025.
The danger? As AI races toward ever-larger models, the small players—the ones who need AI most—are getting sidelined.
What problems does scale solve?
AI’s growth has been relentless—doubling in size every five months, gobbling up more data, parameters, and computing power like an insatiable teen.
In 2013, training a model was a one-day job on a consumer gaming computer. By 2020, it required the power of the world’s top supercomputers. The problem? Hardware can’t keep pace, and costs are still sky-high. Forget Moore’s Law—AI’s ramp-up is outpacing it.
But here’s the twist: more computing doesn’t always mean better results. Once a model hits a certain size, the returns start to dip.
Take tree-based models. They’re faster, cheaper, and less power-hungry for tasks like working with spreadsheets compared to complex Transformers.
In Uganda, where tech budgets are limited, efficient models are crucial because they don’t need an entire data center’s worth of resources.
Even in robotics, a simple ResNet18 vision model, with a bit of human help, can guide a food-serving robot in a restaurant; no billion-parameter model needed.
The takeaway? Effective machine learning is about solving specific problems efficiently, not just building bigger models.
For instance, in areas like healthcare, interpretability and managing uncertainty are more important than sheer scale.
For Ugandan businesses, where budgets are tight, linear and tree-based models are often the smart choice: cost-effective and focused on the goal.
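To make this concrete, here is a minimal sketch, not drawn from any of the papers cited, of how a compact tree-based model from the free, open-source scikit-learn library can be trained on spreadsheet-style data on an ordinary laptop in seconds:

```python
# A minimal sketch: training a compact tree-based model on tabular data.
# scikit-learn's built-in breast-cancer dataset stands in for the kind of
# spreadsheet-style records a small business or clinic might hold.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# A few hundred shallow trees: a tiny fraction of the size of a frontier
# model, yet well suited to structured, tabular problems.
model = GradientBoostingClassifier(n_estimators=200, max_depth=3, random_state=42)
model.fit(X_train, y_train)

print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

Nothing about this toy example requires a GPU or a data center, which is precisely the class of problem where the researchers argue that scale is unnecessary.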
Unrepresentative benchmarks
AI progress is often measured using benchmarks—tests that rate how well models perform. For years, bigger models have dominated these benchmarks, setting the standard for “state-of-the-art” (SOTA).
But here’s the rub: does acing a benchmark mean real-world usefulness? Not always.
Benchmarks, created in controlled academic settings, often miss the complexity of real-life applications. They focus on narrow metrics like accuracy, overlooking critical qualities like robustness, adaptability, and efficiency—qualities that are essential when deploying AI in unpredictable environments.
In Uganda, where AI is just starting to make an impact, a model that excels in controlled tests but fails in practice is of little use.
Another challenge? Data contamination.
Benchmark datasets can end up in training data, skewing results and eroding trust. Generative models, like large language models (LLMs), add to the confusion, as assessing creativity or coherence becomes an inconsistent game.
And new benchmarks for these traits? They’re hit or miss at best.
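To illustrate how the data contamination mentioned above can be screened for, here is a simplified, hypothetical sketch (not taken from the paper) that flags benchmark questions whose eight-word sequences also appear in a training corpus, one common heuristic for spotting leakage:

```python
# A simplified, hypothetical contamination check: flag benchmark questions
# whose 8-word sequences (8-grams) also appear in the training corpus.
def ngrams(text, n=8):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(benchmark_item, training_corpus, n=8):
    item_grams = ngrams(benchmark_item, n)
    return any(item_grams & ngrams(document, n) for document in training_corpus)

# Toy example with made-up strings.
training_corpus = [
    "the quick brown fox jumps over the lazy dog near the river bank",
]
question = "which animal jumps over the lazy dog near the river bank today"
print(is_contaminated(question, training_corpus))  # True: shared 8-gram found
```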
“While open leaderboards and evaluation libraries aim to improve rigor, they don’t address the core issue: benchmarks are just stand-ins, not definitive measures,” the three researchers argue.
Benchmarks don’t tell us if a model truly fits a specific purpose or context, such as using AI for health diagnostics in Uganda, where accuracy and interpretability are key.
The fixation on benchmarks also fuels exaggerated, unverifiable claims—like labeling a model as having “artificial general intelligence.”
These lofty goals can mislead, shifting resources away from practical solutions and towards unreachable dreams.
As the three researchers put it: “Benchmarks have their place but are limited. Real progress means creating AI that excels not just in tests but in diverse, practical settings.”
For Ugandan researchers, chasing high benchmark scores can be a distraction, prioritizing theoretical performance over the nuanced challenges faced on the ground, like local healthcare or communication issues.
When scale becomes a liability
AI’s obsession with bigger models is well-documented, but is it truly sustainable? New research suggests that not every AI challenge needs a heavyweight model, and more efficient approaches often get overlooked.
And yet scaling up comes at a high cost.
While technology has made computing more affordable, the demand for state-of-the-art (SOTA) models has surged, following the Jevons Paradox—greater efficiency often leads to higher consumption.
But this trend has real-world consequences. Larger models require more energy and hardware, creating challenges for nations with limited infrastructure. In Uganda, this could mean AI remains a luxury rather than a tool for practical growth.
For organizations—big or small, even global players like Booking.com—computing expenses are a major burden. And yet at a certain scale, bigger models don’t always translate to better performance; efficiency and robustness become more important, the researchers find.
One compelling example is Simon Mugisha, a 24-year-old recent graduate from Ernest Cook University in Uganda. Mr Mugisha is developing an AI model aimed at detecting cancer early, but he’s hitting a major roadblock: access to medical and patient data.
In an exclusive interview, he shared that while he’s reached out to public hospitals and institutions like the Uganda Cancer Institute, his efforts have been met with resistance. Without a data collection license, Mr Mugisha finds this process challenging and there are no policy measures in the country that could help improve data sharing and accessibility.
To keep going, he has turned to online, open-source data from research firms and health non-profits. But this solution is far from ideal. The data available is limited, sometimes restrictive, and rarely meets the scale needed for training a powerful AI model.
Mr Mugisha's work on early cancer detection highlights a significant issue facing innovators in Uganda: limited access to comprehensive, high-quality medical data. Without this data, breakthrough solutions are stifled, keeping progress in vital sectors like healthcare at bay.
The key takeaway? Scaling AI is more than a technical challenge; it’s a financial one. The high costs associated with running large models underscore the urgent need for a more sustainable approach.
For Uganda’s tech sector, this means focusing on efficient, practical solutions rather than simply chasing scale for scale’s sake.
The “bigger-is-better” mindset prioritizes expensive, resource-draining projects, giving an advantage to tech giants with deep pockets.
This leaves academic institutions and independent researchers—often working with limited funding—struggling to keep up.
The impact of this trend is already evident. Research by Besiroglu et al. (2024) indicates that academia’s role in foundational AI has been diminishing, overshadowed by computationally intensive projects.
While the share of new PhD graduates specializing in AI has risen (from 10.2 percent in 2010 to 19.1 percent in 2021, according to Stanford HAI, 2023), many of these experts are attracted to well-paying industry positions.
This shift has made academia more reliant on corporate funding, aligning research more closely with industry goals.
But this focus on scale risks sidelining alternative, innovative approaches to AI that challenge the status quo. It also limits the exploration of ethical AI development and diverse applications.
With a few dominant players steering the agenda, smaller research efforts face marginalization, and the AI field becomes narrower and more homogeneous in its focus.
The case for small-scale AI
AI has proven that it doesn’t need to hinge on massive models. Smaller, focused projects can still drive meaningful progress, offering insights without requiring huge resources.
How? Broadening AI research priorities can encourage inclusivity and sustainability.
Compact systems can tackle essential questions, like understanding uncertainty or causality, without needing vast datasets. And expanding benchmarks to assess such innovations can shift focus from sheer scale to meaningful progress.
“This fixation on scale has emerged via norms that shape how the scientific community acts. We believe that scientific understanding and meaningful social benefits of AI will come from de-emphasizing scale as a blanket solution for all problems, instead focusing on models that can be run on widely-available hardware, at moderate costs. This will enable more actors to shape how AI systems are created and used, providing more immediate value in applications ranging from health to business, as well as enabling a more democratic practice of AI,” the trio of the aforementioned researchers note.
While it's true that training AI models requires vast datasets for better performance, the cost and sustainability of large-scale data collection are growing concerns.
When approached for this article, Dr Ernest Mwebaze, a research scientist in Kampala specializing in AI and machine learning, shared insights on how researchers are finding innovative ways to tackle the challenge of limited data.
Dr Mwebaze highlights two approaches to navigate this challenge.
One method involves aggregating smaller datasets and prioritizing high-quality data for better results. The main hurdle here is accessing such quality data, which policymakers could help facilitate.
Another approach involves collecting field-specific data to provide context to a larger AI model. This means gathering smaller, focused datasets that are then integrated with broader, pre-existing models, such as those developed by major AI companies like Meta and OpenAI, the maker of ChatGPT.
This hybrid approach can yield better results as it leverages specific, contextual data to enhance the general knowledge embedded in large-scale models.
Dr Mwebaze is putting this concept into practice in northern Uganda’s refugee camps. His team is set to collect specific medical data from doctors on their interactions with refugees, compile it into a set of frequently asked questions, and feed this context into a large-scale AI model to build a chatbot that addresses healthcare and communication challenges in these communities.
“It’s basically possible and it’s working. So we get this specific data, which is not as densely available as in places like Kampala and Mbale, and feed it into an AI model that already has general information. This creates better results and is more efficient than trying to collect data on everything,” he said on Tuesday.
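As a rough sketch of how such a hybrid setup might look in code, locally collected question-and-answer pairs can be retrieved and packed into the prompt sent to a general-purpose model. This is an illustration only, not Dr Mwebaze’s actual system, and the commented-out ask_general_model call stands in for whichever large-model service a team chooses:

```python
# A rough illustration of a hybrid approach: small, locally collected Q&A
# pairs supply the context, and a general-purpose large model supplies the
# language ability. ask_general_model() is a hypothetical placeholder.

# Made-up example entries, for illustration only.
local_faq = [
    {"q": "Which symptoms should prompt an urgent clinic referral?",
     "a": "Persistent high fever, severe dehydration or difficulty breathing."},
    {"q": "How can patients reach the camp clinic after hours?",
     "a": "Through the on-call health worker assigned to each zone."},
]

def retrieve_context(user_question, faq, top_k=2):
    """Pick the FAQ entries that share the most words with the question."""
    q_words = set(user_question.lower().split())
    scored = sorted(
        faq,
        key=lambda item: len(q_words & set(item["q"].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(user_question, faq):
    """Pack the retrieved local guidance and the question into one prompt."""
    context = "\n".join(
        f"Q: {item['q']}\nA: {item['a']}"
        for item in retrieve_context(user_question, faq)
    )
    return ("Answer using only the local guidance below.\n"
            f"{context}\n\nPatient question: {user_question}")

prompt = build_prompt("Who do I contact at night if my child has a high fever?", local_faq)
print(prompt)
# In a real deployment the prompt would then be sent to a large model, e.g.:
# reply = ask_general_model(prompt)  # hypothetical API call
```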
In a country like Uganda, where resources are limited, shifting focus from massive models to efficient, context-specific AI solutions could drive real innovation—one that truly serves the needs of local communities.