Chat AI – A Historical Overview

Before diving into the good stuff 🍅, I’d like to briefly write about the origins and history of language model AI.

Let’s start with the basics. People have been experimenting with chat-based AI for a long time. ELIZA, usually cited as the first chatbot, already existed in 1966. But it wasn’t a real breakthrough, since these early attempts were still purely rule-based: they matched your input against hand-written patterns and returned canned replies, more like a chess program or a dictionary lookup. There was no learning and no real text comprehension. If you used slang, or any phrasing the rules didn’t anticipate, it couldn’t understand you at all.

Many early computer games also tried to use this idea. Some of you might remember the C64 text adventure games: if you typed “Go left,” the hero went left. But if you typed “Dash left!” 😀 the program didn’t get it… These systems were still very primitive and only reacted to fixed, predefined commands.
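
To make the "fixed, predefined commands" idea concrete, here is a minimal sketch of how such a rule-based system might look in Python. This is not how ELIZA or any particular C64 game was actually written; the rule table, the `respond` function, and the reply texts are all made up for illustration.

```python
import re

# Hypothetical rule table: each entry pairs a fixed pattern with a canned reply.
# Nothing here is learned; the program only knows what is written in this list.
RULES = [
    (re.compile(r"^go (left|right|north|south)$", re.IGNORECASE), "You go {0}."),
    (re.compile(r"^i feel (.+)$", re.IGNORECASE), "Why do you feel {0}?"),
]

FALLBACK = "I don't understand that."


def respond(text: str) -> str:
    """Return the reply of the first matching rule, or the fallback."""
    # Drop surrounding whitespace and trailing punctuation before matching.
    cleaned = text.strip().rstrip(".!?").strip()
    for pattern, template in RULES:
        match = pattern.match(cleaned)
        if match:
            return template.format(*(g.lower() for g in match.groups()))
    return FALLBACK


if __name__ == "__main__":
    print(respond("Go left"))     # matches the first rule
    print(respond("Dash left!"))  # no rule covers "dash", so the fallback is used
```

The only way to make this program understand "Dash left!" is for someone to sit down and add another rule by hand, which is exactly the limitation described above.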

After that, interest in chat-based AI declined somewhat. Maybe people got frustrated with its limitations, or maybe there were other reasons, but for a while the whole AI craze faded into the background. Text adventure games were gradually replaced by point-and-click adventure games.

The real turning point came with machine learning and deep learning, which really started to take off after 2010. Chat-based AI only made its true breakthrough once huge amounts of data became available (internet texts, social media, etc.) and powerful hardware emerged, especially advanced GPUs. Instead of CPUs, GPUs became the real workhorses (even today, if you want to run a larger chat-based AI offline, a good CPU usually isn’t enough; you need one or more very powerful graphics cards).

Later, researchers developed neural network-based language models, most importantly those built on the Transformer architecture, which was introduced in 2017 (BERT and GPT are the best-known examples). These models could actually “learn” language in a statistical sense, picking up patterns from huge amounts of text instead of just producing templated responses like ELIZA or those old C64 games. That’s when things started heading in a truly exciting direction…
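
What "learning in a statistical sense" means can be shown with a toy example. The sketch below is nowhere near a Transformer: it is a simple bigram counter that learns which word tends to follow which, and the function names and the tiny training text are invented for illustration. The point is only that the behaviour comes from the data, not from hand-written rules.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict:
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def most_likely_next(counts: dict, word: str):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up training text; a real model would see billions of words.
    corpus = "the hero went left . the hero went right . the hero went left ."
    model = train_bigrams(corpus)
    print(most_likely_next(model, "went"))  # "left" follows "went" most often here
    print(most_likely_next(model, "dash"))  # never seen in training, so None
```

Feed it different text and it makes different predictions without anyone changing the code. Modern language models work on the same basic principle of predicting what comes next, just with vastly more data and a far more capable model.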

Today, we’ve reached a point where, in a Turing Test-style conversation, it can be very difficult (sometimes practically impossible) to tell whether a response comes from a human or not. Even if you write in slang, the AI can handle it. That said, I still think these systems make plenty of mistakes and are in their infancy compared to what they’ll eventually be able to achieve…

I think that’s enough for a short historical overview. Now let’s move on to the more interesting stuff. :)