Nvidia Researchers Argue That SLMs Represent the Future of AI: Here’s Why
Nvidia experts claim that Small Language Models (SLMs) are essential for the future of the artificial intelligence (AI) sector.
Nonetheless, most investments are still directed toward Large Language Models (LLMs). Should this trend persist, the industry may face stagnation, potentially affecting the U.S. economy.
Summary
A significant number of AI investors are focused on companies that create LLM-driven products.
SLM agents tend to be more cost-effective and efficient for specific tasks when compared to LLMs.
Nvidia emphasizes that SLMs hold the key to the future of AI, encouraging organizations to embrace smaller models.
SLMs vs. LLMs
SLMs have up to roughly 40 billion parameters and excel at a narrow range of well-defined tasks while requiring far fewer resources. In other words, they are more budget-friendly.
LLMs, on the other hand, come with a hefty price tag. In April, OpenAI CEO Sam Altman noted that users saying “please” and “thank you” to his company’s flagship product, ChatGPT, costs it tens of millions of dollars in compute, a hint at how steep LLM operating expenses are. SLMs shine here because they do not need expensive data centers to do their work.
For instance, SLMs can serve as customer support chatbots without needing extensive knowledge spanning multiple subjects.
An Nvidia research paper released in June argues that SLM agents, not LLM agents, represent the future of AI:
“…small language models (SLMs) are sufficiently powerful, inherently more suitable, and necessarily more economical for many invocations in agentic systems, and are therefore the future of agentic AI.”
Moreover, LLMs can assist in training SLMs through knowledge distillation: the smaller model learns from the larger model’s outputs rather than starting from scratch, ending up nearly as effective on specific tasks while conserving resources.
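For illustration only, here is a minimal distillation sketch in PyTorch. The toy teacher and student models, their sizes, and the hyperparameters are placeholders invented for this example, not anything specified in Nvidia’s paper; the one load-bearing idea is the KL-divergence loss between the two models’ softened output distributions.

```python
# Minimal knowledge-distillation sketch (toy models, hypothetical sizes).
# The student learns to match the teacher's output distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ, DIM_T, DIM_S, T = 1000, 8, 512, 128, 2.0  # T = softening temperature

teacher = nn.Sequential(nn.Embedding(VOCAB, DIM_T), nn.Flatten(1), nn.Linear(DIM_T * SEQ, VOCAB))
student = nn.Sequential(nn.Embedding(VOCAB, DIM_S), nn.Flatten(1), nn.Linear(DIM_S * SEQ, VOCAB))

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
tokens = torch.randint(0, VOCAB, (32, SEQ))  # a dummy batch of token ids

with torch.no_grad():                        # the frozen teacher only provides targets
    teacher_logits = teacher(tokens)
student_logits = student(tokens)

# KL divergence between temperature-softened distributions is the classic
# distillation loss; scaling by T^2 keeps gradient magnitudes comparable.
loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * (T * T)
loss.backward()
optimizer.step()
```

In practice, the teacher would be a frozen LLM and the student an SLM fine-tuned on task-specific data.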
The smallest language models, with around one billion parameters, can run on standard CPUs.
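To make that concrete, the sketch below answers a support-style question with a roughly 1.1-billion-parameter open model on a plain CPU. It assumes the Hugging Face transformers library; the model ID is just one public example, and any comparable SLM would serve.

```python
# Sketch: serving a small (~1.1B-parameter) model on a plain CPU with
# Hugging Face transformers. The model ID is one public example.
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # small enough for CPU inference
    device=-1,  # -1 = run on CPU; no GPU or data center required
)

messages = [
    {"role": "system", "content": "You are a concise customer-support assistant."},
    {"role": "user", "content": "How do I reset my account password?"},
]

# The pipeline appends the assistant's reply to the conversation; print it.
result = chat(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])
```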
Companies don’t require virtual humans with exhaustive knowledge; they need tools capable of addressing specific tasks swiftly and accurately.
This is why low-cost SLM agents are far more appealing investments than LLMs. Notably, GPT-5 already routes requests across a variety of models, including smaller ones, depending on the task.
What happens if the AI sector faces a setback?
Crypto and blockchain companies are increasingly adopting LLMs to enhance their operations and decision-making processes. DeFi platforms like Zignaly use LLMs for trade summaries and investment insights, while infrastructure entities like Platonic and Network3 apply them to assist developers and optimize on-chain workflows.
Trading firms are also combining LLMs with other AI tools for market intelligence and predictive analytics.
The flagship LLM projects, however, remain Google’s Gemini, OpenAI’s GPT, Anthropic’s Claude, and xAI’s Grok. Each of them requires extensive data centers, with substantial electricity use and capital expenditure.
The AI sector in the U.S. secured $109 billion in investments in 2024 alone, and this year American AI companies have already spent $400 billion on infrastructure. Reports from August indicated that OpenAI was weighing a share sale valuing the company at around $500 billion. According to Morgan Stanley’s Andrew Sheets, AI firms could pour $3 trillion into data centers by 2029.
Additionally, IDC Research predicts that by 2030, every dollar invested in AI-driven business solutions will yield $4.60 for the global economy.
Nonetheless, concerns persist. If the planned data centers are not built out, or fail to pay off, key investors could be deterred, and a pullback in AI funding would translate into lower spending across the sector.
A slowdown among LLM-focused AI firms could stem from unstable electricity supplies, high interest rates, trade disputes, growing demand for SLMs, and other factors.
Some even argue that the data-center buildout is inflating a bubble, one lacking the redeeming upside of the dotcom era, whose overinvestment at least spurred the Internet revolution. The problem with data centers is their dependence on chips that inevitably become outdated, possibly within just a few years. So even though these chips are costly, they cannot easily be repurposed.
How to avoid a collapse
To prevent a potential collapse, Nvidia researchers advise AI companies to concentrate on SLMs and to make their SLM agents more specialized. This conserves resources while boosting efficiency and competitiveness.
The researchers also propose building modular agent systems that stay flexible, reserving LLMs solely for complex reasoning tasks and routing routine work to SLMs, as in the sketch below.
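A hypothetical sketch of that routing pattern: the agent sends routine requests to a cheap local SLM and escalates only hard, multi-step prompts to a hosted LLM. Every function and heuristic here is invented for illustration; a production router would likely use a learned classifier rather than keywords.

```python
# Hypothetical sketch of a modular agent router: cheap SLM by default,
# escalating to an LLM only when the task looks like complex reasoning.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Route:
    name: str
    handler: Callable[[str], str]   # backend call for an SLM or LLM (stubbed here)

def call_slm(prompt: str) -> str:   # stub: imagine a local 1-3B-parameter model
    return f"[SLM answer to: {prompt}]"

def call_llm(prompt: str) -> str:   # stub: imagine a hosted frontier model
    return f"[LLM answer to: {prompt}]"

def needs_llm(prompt: str) -> bool:
    """Toy heuristic: long or reasoning-heavy prompts go to the LLM."""
    keywords = ("prove", "plan", "analyze", "multi-step")
    return len(prompt) > 500 or any(k in prompt.lower() for k in keywords)

def route(prompt: str) -> str:
    r = Route("llm", call_llm) if needs_llm(prompt) else Route("slm", call_slm)
    return r.handler(prompt)

print(route("What are your support hours?"))          # handled by the SLM
print(route("Plan a multi-step migration strategy"))  # escalated to the LLM
```

The design point is the paper’s: most agent invocations are routine, so the expensive model only runs when the task genuinely demands it.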