The AI landscape is ever changing.
While companies such as Google, xAI, and OpenAI get the lion's share of the attention, there is a new wave of innovation occurring that is going to provide AI access to millions of people.
This is the future of smaller language models.
One of the interesting aspects of AI is that, for all the talk of centralization, we are actually seeing the reverse happen. Innovation is occurring at many different levels. This means that, as smaller models come into use, data ends up spread out.
Of course, the bigger entities will still have the advantage. When it comes to cutting-edge capabilities, Big Tech will still dominate. It is the leader in both data and compute. Algorithms are the one area where smaller entities can catch up, although the big boys are adept here too.
The advantages of smaller models for the average user are many.
In this article, I will explain how things are going to change and why.
Smaller Language Models Will Be The Norm
Large language models are a catchall.
The best way to think of them is as big and bulky. They are designed to capture as much as possible, using input for further expansion. Specialization is not the focus; the goal is to be general, appealing to the broadest possible audience.
Many are impressed by their capabilities across a wide range of disciplines. They are improving overall, but there are many areas where they fall short. Simply put, they lack focused data on many niche areas.
Here is where smaller models enter. Not only can they incorporate more specific data, but the feedback loop also becomes more personalized.
What do I mean by this?
If we look at the SLM as an entity, its users create a "personalized" model compared to the larger counterparts. These models end up being trained on input from a much smaller user base, so their likes and interests carry a great deal more weight.
Of course, any interaction with the model generates data that is not available to the larger companies unless it is posted online. It is at this point that we start to see divergence.
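To make this concrete, here is a minimal sketch, in Python, of what such a personalized feedback loop might look like: user interactions are logged with a rating, and the highest-rated exchanges are exported as a small fine-tuning dataset. The class names and the JSONL format are illustrative assumptions, not any particular vendor's API.

```python
import json
from dataclasses import dataclass, field

# Illustrative sketch only: the names and the JSONL format are
# assumptions, not a specific fine-tuning API.

@dataclass
class Interaction:
    prompt: str
    response: str
    rating: int  # e.g. 1-5 feedback from the user

@dataclass
class FeedbackLog:
    interactions: list = field(default_factory=list)

    def record(self, prompt: str, response: str, rating: int) -> None:
        self.interactions.append(Interaction(prompt, response, rating))

    def export_finetune_set(self, path: str, min_rating: int = 4) -> int:
        """Write the highest-rated exchanges to a JSONL file that a
        small model could be fine-tuned on. Returns how many were kept."""
        kept = [i for i in self.interactions if i.rating >= min_rating]
        with open(path, "w", encoding="utf-8") as f:
            for i in kept:
                f.write(json.dumps({"prompt": i.prompt,
                                    "completion": i.response}) + "\n")
        return len(kept)

# With a small user base, every rating carries real weight in the
# next training round.
log = FeedbackLog()
log.record("How do I file a warranty claim?",
           "Use the claims form on the internal portal.", 5)
log.record("What's the weather today?",
           "I don't have live weather data.", 2)
print(log.export_finetune_set("finetune.jsonl"))  # -> 1
```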
Efficiency And Customization
The cost of training larger models keeps increasing. This is something we see with each generation. No company is exempt.
I would say the percentage of users who need the latest cutting-edge capabilities is small. Honestly, a minuscule fraction of the population uses an AI model for anything beyond its basic capabilities.
For example, how many have used a model for its full reasoning ability? Perhaps those who are involved in deep research. Most have conversations about basic things.
This means that not only can smaller models be trained for a fraction of the cost, they can also run more efficiently. This saves money. Token prices are dropping, yet the number of tokens used only increases.
We also see the ability to customize. Certainly the larger models can do this. However, if we look at OpenAI, the interaction is centered on ChatGPT. There is no other point of engagement.
Smaller models can be centered around where the users actually engage. In a company setting, this can be internal databases. Whatever the vehicle, data from users is incorporated into newer versions of the model. When tied to a vector database, real-time data instantly becomes available, as sketched below.
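As a rough illustration of that last point, the snippet below shows an in-memory vector store: the moment a new document is added, it is searchable by the next query. The bag-of-words embedding is a toy stand-in; an actual deployment would use a learned embedding model and a dedicated vector database.

```python
import hashlib
import numpy as np

def embed_text(text: str, dim: int = 1024) -> np.ndarray:
    """Toy bag-of-words embedding via feature hashing. A real system
    would use a learned embedding model instead."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

class VectorStore:
    """Minimal in-memory store: new data is queryable immediately."""
    def __init__(self) -> None:
        self.docs: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, text: str) -> None:
        self.docs.append(text)
        self.vectors.append(embed_text(text))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = embed_text(query)
        scores = [float(q @ v) for v in self.vectors]
        top = sorted(range(len(scores)), key=lambda i: scores[i],
                     reverse=True)[:k]
        return [self.docs[i] for i in top]

store = VectorStore()
store.add("Q3 sales report: revenue up 12% in the EMEA region.")
store.add("Onboarding checklist for new engineering hires.")
# Lexical overlap on "sales" should surface the report first.
print(store.search("latest sales numbers", k=1))
```

The point is less the mechanics than the timing: anything dropped into the store, whether an internal report or a support ticket, is part of the model's working context on the very next request.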
xAI is doing this with Grok on the X platform. Google is obviously implementing this across its different applications. Meta has a number of apps that billions of people use each month.
Smaller models can do the same thing. Something like Rafiki can use the power of models to provide a robust service with as few as a couple hundred users.
Over the next few years, we are going to see this fanning out in many different directions. While the talk is about the LLMs, it is worth watching the smaller versions.