How does Chat (language model) AI work?

Many people assume (and not without reason) that artificial intelligence actually “understands” what we say. In reality, it doesn’t understand in the human sense. What we experience is an illusion… but a very convincing one 😉.

Chat AI processes text (prompts) by breaking it down into small pieces called tokens (I already mentioned this in a previous post, since many paid AI services even charge based on token usage). A token can be a whole word, part of a word, or even punctuation.

For example (simplified; real tokenizers often split text differently):

  • “dog” → [“dog”]

  • “dogs” → [“dog”, “s”]

  • “Hello, world!” → [“Hello”, “,”, “world”, “!”]
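
To make the splitting concrete, here is a minimal sketch in Python. It is only a toy: the tiny vocabulary (VOCAB) and the tokenize function are made up for this post, and real tokenizers learn tens of thousands of pieces from data and may cut words in other places.

```python
import re

# Hypothetical toy vocabulary, chosen only to reproduce the examples above.
VOCAB = {"dog", "s", "Hello", "world", ",", "!"}

def tokenize(text):
    """Split text into words and punctuation, then greedily take the longest known piece."""
    tokens = []
    for chunk in re.findall(r"\w+|[^\w\s]", text):
        start = 0
        while start < len(chunk):
            # Try the longest candidate first and shrink until one is in the vocabulary.
            for end in range(len(chunk), start, -1):
                if chunk[start:end] in VOCAB:
                    tokens.append(chunk[start:end])
                    start = end
                    break
            else:
                # No known piece starts here: keep the single character as its own token.
                tokens.append(chunk[start])
                start += 1
    return tokens

print(tokenize("dog"))            # ['dog']
print(tokenize("dogs"))           # ['dog', 's']
print(tokenize("Hello, world!"))  # ['Hello', ',', 'world', '!']
```

The "longest piece first" rule is just one simple way to do it; it is an assumption of this sketch, not a description of any particular AI service.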

When you ask a question, the AI relies on patterns it picked up during training to estimate how likely each possible next token is, and then picks one of the most likely. It adds that token to the text and repeats the process again and again until a complete answer is formed.

👉 Example:

  • Question: “What is the capital of Hungary?”

  • The AI works like this (very simplified): “After the phrase ‘capital of Hungary’, the most likely token is ‘Budapest’.”

  • Answer: “Budapest.”
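
The "predict the next token, then repeat" loop can be sketched in a few lines of Python. This is a deliberately tiny stand-in: the three-sentence CORPUS and the train and generate functions are invented for illustration, and the sketch just counts which word follows the previous two words. Real chat models use large neural networks instead of a count table, but the loop has the same shape.

```python
from collections import Counter, defaultdict

# Hypothetical miniature "training data"; real models see vastly more text.
CORPUS = [
    "the capital of Hungary is Budapest .",
    "the capital of Hungary is Budapest .",
    "the capital of France is Paris .",
]

def train(corpus):
    """Count which token follows each pair of preceding tokens."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for i in range(2, len(tokens)):
            counts[(tokens[i - 2], tokens[i - 1])][tokens[i]] += 1
    return counts

def generate(counts, prompt, max_tokens=10):
    """Repeatedly append the most frequent next token until '.' or no known pattern."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        context = tuple(tokens[-2:])
        if context not in counts:
            break  # this context never appeared in training
        next_token = counts[context].most_common(1)[0][0]
        tokens.append(next_token)
        if next_token == ".":
            break
    return " ".join(tokens)

counts = train(CORPUS)
print(generate(counts, "the capital of Hungary"))
# the capital of Hungary is Budapest .
```

Notice that nothing in the code "knows" what a capital is. It only continues the text with whatever followed most often in the examples it was given.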

So Chat AI doesn’t “know” things in a human way. Instead, it builds answers from statistical patterns learned during training (imagine a huge library compressed into numbers; it doesn’t grow while you chat, although later versions are often trained on more data). And it does this so skillfully that it really feels like a conversation, sometimes so much that you can get lost in it, whether it’s a chat or even part of your workflow.

Still, despite the “lifelike” simulation, it’s only an appearance. In fact, some of the offline AI models I mentioned in earlier posts (with basic knowledge) can be just a few gigabytes in size and already run on your own computer if you have a good graphics card. So the technology itself isn’t impossibly complex – what’s truly impressive is the amount of data behind it.
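
Where does "a few gigabytes" come from? A rough back-of-the-envelope sketch in Python: a model file is mostly a long list of numbers (its parameters), so its size is roughly the number of parameters times the storage used per number. The 7 billion parameter count and the 16-bit and 4-bit storage sizes below are assumed example values, not figures for any specific model from my earlier posts, and model_size_gb is a helper made up for this post.

```python
def model_size_gb(num_parameters, bits_per_parameter):
    """Approximate storage for the parameters alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9

# Assumed example: a model with 7 billion parameters.
print(model_size_gb(7e9, 16))  # 14.0 (16 bits per parameter)
print(model_size_gb(7e9, 4))   # 3.5  (compressed to 4 bits per parameter)
```

So a model of that assumed size, stored in compressed form, lands in the "few gigabytes" range.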

More recently, AI tools have also started to take copyright into account: many of them, especially those that search the web, cite their sources when quoting or extracting information (though the citations are still worth checking). That’s why today, many people don’t even start with Google when they want to know something; they simply turn to an AI for help. 🙂

But about copyright… that’s a topic for another post 💡