🧠 AI, In Plain English  Β·  June 2026

I Built a Suite of AI Apps. Here's What Every AI Concept Actually Means.

A plain-English guide to temperature, tokens, context, and the other ideas behind every AI agent β€” explained through the apps I actually built with them.

⚑ Explore the Advanced AI Apps

Published: June 22, 2026  |  Category: AI Engineering Β· Concepts Explained

5+Apps Shipped
8Core Concepts
0PhDs Required
100%Plain English
The core concepts behind every AI agent, explained in plain English
πŸ’‘ Here's the thing nobody tells you: you don't need a maths degree to build with AI. You need to understand about eight ideas. I learned them the slow way β€” by building a study app, a legal advisor, a farming assistant, a content tool and a voice agent, and getting each of those eight ideas wrong at least once. Here they are, in plain English.

Over the last couple of years I've built and shipped a small pile of AI products. A suite of consumer apps β€” study help, a legal advisor, a content creator, a farming assistant in Hindi and Punjabi β€” plus a phone-based assistant for a company that manages tens of thousands of buildings, and a Slack bot that keeps my team in sync.

Along the way I kept bumping into the same handful of concepts. They have intimidating names and most explanations make them sound harder than they are. So here's my version: each idea, the story of the app that taught it to me, and what it actually means β€” no jargon, I promise.

1. Sometimes you want a robot. Sometimes you want a poet.

I built a feature in my legal app that pulls the exact section of a law and explains it. It has to be exact. No creativity, no "interpretation," no surprises β€” just the right section, the same way, every time.

Then I built a content tool that writes social posts and marketing copy. That one needs the opposite: personality, flair, word choices you didn't see coming.

Same underlying model. Opposite needs. And early on, the legal one kept "getting creative" with the wording while the content one kept writing safe, boring copy. The fix was one single dial: temperature.

🌑️

What temperature actually does

Low temperature β€” the robot

Predictable, repeatable, plays it safe. Perfect for quoting a law, pulling a number off an invoice, or anything where the same input should always give the same output.

High temperature β€” the poet

Varied, surprising, willing to take a swing. Perfect for marketing copy, brainstorming, captions β€” anywhere a fresh answer beats the obvious one.

So, what is temperature? It's a single setting that controls how safe versus creative an AI's answers are. Turn it down and the model becomes a reliable robot. Turn it up and it becomes a poet. Most people never touch it β€” and then wonder why their app feels either dull or unhinged.

The model didn't need to be smarter. It needed to know whether I wanted a robot or a poet that day.

2. The agent has the memory of a goldfish

My lecture app takes a 90-minute class and turns it into clean notes, a summary, and exam-style questions. The first time I tried it, I did the obvious thing: shove the entire transcript at the model and ask for notes.

It choked. Long recordings simply didn't fit, and when they almost fit, the model would "forget" the beginning of the lecture by the time it reached the end β€” like a student who slept through the first hour.

So, what is a context window? It's the model's short-term memory β€” the total amount of text it can hold in its head at one time. Go past it and the earliest stuff falls out the back. The real skill isn't dumping everything in; it's deciding what the model actually needs to see for the question at hand. For a long lecture that meant breaking it into chunks, summarising each, then summarising the summaries.

An AI doesn't read your whole document. It reads as much as fits in its memory β€” your job is to choose which part that is.

3. You're not paying per question. You're paying per word.

The first month one of my consumer apps got popular, I felt great β€” right up until I saw the bill. AI isn't like normal software, where one more user costs almost nothing. Every single answer costs real money, and the meter runs on tokens.

So, what is a token? Roughly, a chunk of a word β€” the unit a model reads and writes in. "Hello" is one token; a long word might be two or three. And here's the part that bites you: you pay for everything you send in (the user's question, the document, your instructions) and everything the model sends back. A chatty assistant answering a long question is quietly expensive.

This is why my apps run on yearly plans with sensible limits, and why I instrument cost per request from day one. Going "viral" while every free user racks up an unbounded bill isn't a success story β€” it's an invoice you can't pay.

πŸ’°

The mental model that saved me money

What I assumed

"One question = one fixed cost." So I let prompts and answers sprawl, padded them with extra instructions, and never measured.

What's actually true

You pay for every word in and every word out. Tighter prompts, shorter answers and hard limits aren't stinginess β€” they're the business model.

4. Don't make it memorise. Give it the textbook.

The hardest thing I've built is a voice assistant that takes work orders over the phone. A caller half-remembers a building's name on a noisy line, and the agent has to pin down the right one out of tens of thousands. My first version asked the model to just "know" the answer. It confidently picked the wrong building constantly.

The fix wasn't a smarter model. It was giving the model the textbook. Before it answers, I search a proper index for the buildings that actually match, hand the model just those few, and ask it to choose from them. Accuracy shot up overnight.

So, what is RAG? It's an ugly acronym (retrieval-augmented generation) for a simple idea: look it up first, then answer. Instead of trusting the model's memory, you fetch the relevant facts from a source you trust and feed them in. It's the difference between an open-book exam and a from-memory one β€” and it's the single biggest weapon against made-up answers.

A model without retrieval is a confident liar. A model with good retrieval barely has to be clever at all.

5. The job description you write before anyone says a word

The same model powers my farming assistant and my legal advisor, but they behave like completely different people. The farming one talks like a helpful neighbour, in simple Hindi or Punjabi, patient with someone who's never used an app. The legal one is precise, careful, and never guesses about the law.

I didn't fine-tune two different models for that. I just wrote two different system prompts.

So, what is a system prompt? It's the standing instruction you give the model before the user ever speaks β€” its job description, its tone, its rules, the things it must never do. Get it right and the same engine becomes a calm teacher, a cautious lawyer, or a punchy copywriter. Get it vague and you get a generic chatbot that sounds like every other one.

You're not just choosing a model. You're hiring an employee and writing their first-day briefing.

6. When you want an answer, not an essay

I built a tool that reads a messy PDF invoice and pulls out the numbers β€” vendor, date, total, line items. The trouble is, models love to chat. Ask for the total and you'd get "Sure! The total on this invoice appears to be β‚Ή12,400, though you may want to double-check…" β€” lovely for a human, useless for code that just needs a number.

So, what is structured output? It's telling the model to reply in a strict, machine-readable shape β€” usually JSON, basically a tidy labelled box β€” instead of prose. "Give me only this: vendor, date, total, items." Now my code gets clean fields it can drop straight into a database, with no fishing the number out of a paragraph. The moment an AI feature has to talk to the rest of your software, this is the concept that makes it reliable.

7. Teaching it to press buttons, not just talk

My team's Slack bot can answer "what did everyone ship last week?" That sounds like a chat question, but the model can't answer it from thin air β€” the information lives in a database, not in its training. So I let it use tools: when it sees that question, it doesn't make something up, it calls a function that actually queries our data, gets real numbers back, and then explains them.

So, what is function calling (or "tool use")? It's giving the model a set of buttons it's allowed to press β€” look up the weather, search a database, send a message β€” and letting it decide when to press them. This is the leap from a chatbot that talks to an agent that does things. My farming app uses it to fetch live weather; the Slack bot uses it to pull real activity. The model stays the brain; the tools are its hands.

A chatbot answers questions. An agent answers questions by going and finding out.

8. It will lie to you with a completely straight face

There's a specific kind of dread in watching an AI tell a paying user something that's flatly wrong β€” with total confidence. In a legal app, that's not just embarrassing, it's dangerous. Early on I treated this as a "later" problem. By the time I noticed I needed safeguards, the agent had already said things it shouldn't have.

So, what is hallucination? A model is built to produce text that sounds right, not to know when it's actually wrong. So sometimes it just… makes things up, fluently. There's no built-in alarm bell.

The answer isn't a magic setting β€” it's guardrails you build yourself: give it real sources to quote from (back to retrieval), check what it's allowed to claim, and let it say "I'm not sure β€” let me get you a human" instead of inventing an answer. Your users can't tell confidence from correctness. Your guardrails have to.

The whole thing in one breath

If you remember nothing else: temperature sets robot-vs-poet. Tokens are what you pay for, in and out. The context window is its short-term memory. RAG means look it up before you answer. The system prompt is its job description. Structured output is for when code needs an answer, not an essay. Function calling turns a talker into a doer. And hallucination is why you build guardrails before you need them.

None of this requires a PhD. It requires building something real, watching it fail in an interesting new way, and learning the name of the thing that just bit you. Eight ideas. That's the whole vocabulary behind every AI agent I've shipped.

You don't learn AI by reading about it. You learn it by shipping something and finding out which of these eight ideas you got wrong.

Frequently Asked Questions

What does temperature mean in an AI model?
Temperature is a single setting that controls how safe versus creative an AI's answers are. Low temperature makes it predictable and repeatable β€” good for exact tasks like quoting a law or pulling a number off an invoice. High temperature makes it varied and surprising β€” good for writing marketing copy or brainstorming. Same model, opposite behaviour, one dial.
What is a token and why does it cost money?
A token is roughly a chunk of a word β€” the unit an AI model reads and writes in. You don't pay per question; you pay per token, for everything you send in and everything it sends back. Long documents and chatty answers cost more, which is why apps that scale have to plan limits and tiers before they grow, not after.
What is a context window?
The context window is the model's short-term memory β€” the total amount of text it can hold in mind at once. Anything beyond it gets forgotten. For a 90-minute lecture or a long document, you can't just dump everything in; you have to decide what the model actually sees, which is its own design problem.
What is RAG (retrieval-augmented generation)?
RAG means giving the model the right reference material to answer from, instead of hoping it memorised the answer. You search a trusted source first, hand the model only the relevant passages, and ask it to answer from those. It's the difference between an open-book exam and one from memory β€” and it dramatically reduces made-up answers.
What is a system prompt?
A system prompt is the standing instruction you give the model before the user ever speaks β€” its job description, tone, rules and boundaries. A farming assistant gets told to talk like a helpful neighbour in simple Hindi; a legal assistant gets told to be precise and cautious. Same engine, completely different employee.
Why do AI models hallucinate, and what can you do about it?
Models are built to produce plausible-sounding text, not to know when they're wrong, so they will state false things with total confidence. You manage it with guardrails: give the model real sources to quote from, validate what it's allowed to claim, and let it say "I don't know" instead of inventing an answer. Users can't tell confidence from correctness, so your checks have to.
⚑

Built by people who ship AI, not just talk about it

Every concept here came from a real app in real users' hands β€” a study tool, a legal advisor, a farming assistant, a content creator and more, all powered by the same eight ideas. Take a look at what we've built.

⚑ Explore Advanced AI Apps
Note: This is a first-person, plain-English explainer based on real experience building AI products. Technical concepts are simplified on purpose, and specifics are generalized where client confidentiality applies. Questions or corrections? support@advancedaiapps.com