AI Agent
Software that uses an LLM to plan and execute multi-step tasks — calling tools, reading data, making decisions — with limited human input. Unlike a chatbot, which mostly just responds.
Agentic Workflow
A sequence of steps where an AI agent decides what to do next based on results so far — rather than running a fixed script. Common in sales prospecting, research and triage tasks.
LLM
Large Language Model
The underlying AI that powers tools like ChatGPT, Claude and Gemini — trained on vast text data to understand and generate language. The brain inside an agent.
Prompt Engineering
The discipline of crafting clear, specific instructions for an LLM to get reliable, high-quality output. The single highest-leverage human skill in modern AI workflows.
System Prompt
The standing instruction that defines how an LLM should behave across an entire conversation — its role, tone, constraints and goals. Set once, applies always.
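In chat-style APIs the system prompt is typically the first message in the conversation. A minimal sketch (the message-dict shape mirrors common chat APIs; the helper name is ours):

```python
def build_messages(system_prompt: str, user_input: str) -> list[dict]:
    """Assemble a chat-style message list; the system role applies to the whole conversation."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]

msgs = build_messages(
    "You are a concise support agent. Answer in two sentences or fewer.",
    "How do I reset my password?",
)
```

Every later user turn is appended to the same list, so the system message keeps applying without being repeated.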
Few-Shot Prompting
Showing the LLM two to five examples of the kind of output you want, so it can pattern-match rather than guess. Hugely effective for consistent formatting.
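A few-shot prompt is just the instruction plus worked input/output pairs, with the real query at the end in the same format. A minimal sketch (helper name and labels are ours):

```python
def few_shot_prompt(instruction: str, examples: list[tuple[str, str]], query: str) -> str:
    """Prepend worked input/output examples so the model pattern-matches the format."""
    parts = [instruction]
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    # End with the real query and a dangling "Output:" for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

prompt = few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("Loved the onboarding flow", "positive"),
     ("The dashboard keeps crashing", "negative")],
    "Support replied within minutes",
)
```

Ending on the bare `Output:` label is what nudges the model to answer in exactly the demonstrated format.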
Chain-of-Thought (CoT)
A prompting technique that asks the LLM to think step by step before answering. Dramatically improves accuracy on complex reasoning tasks.
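In its simplest form, chain-of-thought is just an instruction appended to the question asking for visible reasoning before the answer. A minimal sketch (wording and helper name are ours):

```python
def chain_of_thought(question: str) -> str:
    """Wrap a question so the model reasons out loud before committing to an answer."""
    return (
        f"{question}\n\n"
        "Think through this step by step, showing your reasoning, "
        "then give the final answer on its own line prefixed with 'Answer:'."
    )

prompt = chain_of_thought("A subscription costs $29/month with a 15% annual discount. What is the yearly price?")
```

Asking for a fixed `Answer:` prefix also makes the final answer easy to extract programmatically.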
RAG
Retrieval-Augmented Generation
An architecture where the LLM looks up relevant documents from a knowledge base before answering — so it grounds responses in your data, not just its training set.
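The retrieve-then-answer flow can be sketched in a few lines. Here word overlap stands in for real embedding search, and the knowledge-base contents are invented for illustration:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a stand-in for embedding similarity)."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def rag_prompt(query: str, docs: list[str]) -> str:
    """Build a prompt that grounds the answer in the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

kb = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Refunds require the original receipt.",
]
prompt = rag_prompt("How long do refunds take?", kb)
```

In production the retrieval step queries a vector database over embeddings, but the shape — fetch relevant passages, splice them into the prompt, then ask — is the same.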
Fine-Tuning
Further training a base LLM on your own data so it specialises in your domain, voice or task. Powerful but increasingly replaced by good prompting + RAG.
Hallucination
When an LLM generates plausible-sounding but factually wrong output. The reason every AI workflow needs human checkpoints on anything high-stakes.
Context Window
The maximum amount of text an LLM can consider at once — measured in tokens. Modern models handle 100K to 1M+ tokens, enough for whole books or codebases.
Token
The unit of text an LLM processes — roughly 0.75 words in English. Pricing, context limits and rate limits are all measured in tokens.
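The 0.75-words-per-token rule of thumb gives a quick back-of-envelope estimate for cost and context budgeting (real tokenizers vary by model; this is a rough heuristic, not a tokenizer):

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~0.75 English words per token, so tokens ≈ words / 0.75."""
    words = len(text.split())
    return round(words / 0.75)
```

So a 1,500-word article is roughly 2,000 tokens — handy for sanity-checking whether a document fits a model's context window before sending it.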
Embedding
A mathematical vector representing the meaning of a piece of text. Lets you find semantically similar content ("customer churn" matches "users leaving") instead of just keyword matches.
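Similarity between embeddings is usually measured with cosine similarity: vectors pointing the same way score near 1.0, unrelated ones near 0.0. A sketch with toy 3-dimensional vectors (real embeddings have hundreds of dimensions, and the values here are invented):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings": near-synonyms get nearby vectors, unrelated text does not.
churn = [0.9, 0.1, 0.3]     # "customer churn"
leaving = [0.8, 0.2, 0.35]  # "users leaving"
weather = [0.1, 0.9, 0.0]   # "weekend weather"
```

With real embeddings from a model, `cosine_similarity(churn, leaving)` would be high while `cosine_similarity(churn, weather)` would be low — which is exactly how semantic search beats keyword matching.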
Vector Database
A database optimised for storing and searching embeddings — the backbone of most RAG systems. Examples: Pinecone, Weaviate, pgvector.
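Conceptually, a vector database stores (text, embedding) pairs and answers "nearest neighbours" queries by similarity. A toy in-memory sketch of that idea (class name and sample data are ours; real systems add indexing so search stays fast at millions of vectors):

```python
import math

class TinyVectorStore:
    """In-memory sketch of what Pinecone/Weaviate/pgvector do at scale."""

    def __init__(self):
        self.items: list[tuple[str, list[float]]] = []

    def add(self, text: str, embedding: list[float]) -> None:
        self.items.append((text, embedding))

    def search(self, query_embedding: list[float], k: int = 1) -> list[str]:
        """Return the k stored texts whose embeddings are most similar to the query."""
        def sim(v: list[float]) -> float:
            dot = sum(x * y for x, y in zip(query_embedding, v))
            qn = math.sqrt(sum(x * x for x in query_embedding))
            vn = math.sqrt(sum(x * x for x in v))
            return dot / (qn * vn)
        ranked = sorted(self.items, key=lambda item: sim(item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = TinyVectorStore()
store.add("customer churn report", [0.9, 0.1])
store.add("office lunch menu", [0.1, 0.9])
nearest = store.search([0.85, 0.2])
```

Production systems replace the linear scan with approximate nearest-neighbour indexes, but the interface — add embeddings, query by similarity — is the same.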
MCP
Model Context Protocol
An open standard for connecting AI models to external tools and data sources in a consistent way. Lets one agent talk to your CRM, calendar, files and inbox without bespoke integrations.
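MCP is built on JSON-RPC 2.0; a client invokes a server's tool with a `tools/call` request naming the tool and its arguments. An illustrative message (the tool name is hypothetical, and the exact shape should be checked against the official protocol spec):

```python
import json

# Illustrative MCP-style request: JSON-RPC 2.0 framing with a "tools/call" method.
# "calendar_create_event" is a made-up tool name for this sketch.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "calendar_create_event",
        "arguments": {"title": "Demo call", "when": "2025-06-12T15:00"},
    },
}
wire = json.dumps(request)  # what actually travels between client and server
```

Because every server speaks the same message shape, one agent can call a CRM, a calendar and a file store through identical plumbing.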
Multi-Agent System
An architecture where multiple specialised agents work together — one researcher, one writer, one reviewer — each with its own role and prompt. Often produces better results than one large agent.
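The simplest multi-agent topology is a pipeline: each agent's output becomes the next agent's input. A sketch with stub agents (in practice each stub would wrap an LLM call with its own system prompt; the stubs and names here are invented):

```python
def run_pipeline(task: str, agents: list) -> str:
    """Pass the task through specialised agents in sequence, each refining the last output."""
    output = task
    for agent in agents:
        output = agent(output)
    return output

# Stub agents standing in for LLM-backed roles.
researcher = lambda t: f"[facts about: {t}]"
writer = lambda t: f"[draft based on {t}]"
reviewer = lambda t: f"[approved: {t}]"

result = run_pipeline("competitor pricing", [researcher, writer, reviewer])
```

Splitting the work this way lets each role have a narrow, focused prompt, which is often where the quality gain over one large do-everything agent comes from.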
Tool Use / Function Calling
An LLM's ability to call external functions (search the web, query a database, send an email) and use the results in its reasoning. The capability that turns models into agents.
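The mechanics: the model emits a structured tool call (typically JSON naming a function and its arguments), your code executes the matching function, and the result goes back into the model's next turn. A sketch with a simulated model output (tool names and the output format are ours for illustration):

```python
import json

# Hypothetical tools the agent is allowed to use.
def search_web(query: str) -> str:
    return f"(top results for '{query}')"

def send_email(to: str, body: str) -> str:
    return f"email queued to {to}"

TOOLS = {"search_web": search_web, "send_email": send_email}

def dispatch(tool_call: str) -> str:
    """Parse a model-emitted tool call and execute the matching function."""
    call = json.loads(tool_call)
    return TOOLS[call["name"]](**call["arguments"])

# Simulated model output requesting a tool; a real model produces this via its
# function-calling interface rather than as free text.
model_output = '{"name": "search_web", "arguments": {"query": "Q3 churn rate"}}'
result = dispatch(model_output)  # fed back to the model as the tool's result
```

The loop of model-proposes-call, code-executes, result-returns is what turns a text predictor into something that can act.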
Inference
The act of running a trained model to get an output. Every time you ask ChatGPT a question, you're paying for inference.
Human-in-the-Loop (HITL)
An agent design pattern where humans review or approve AI outputs at defined checkpoints — before sending an email, before updating a record, before making an irreversible call.
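The pattern reduces to a gate: the irreversible action only runs if an approval callback says yes. A minimal sketch (names are ours; in a real workflow the callback might be a console prompt, a Slack button or a review queue):

```python
from typing import Callable

def with_approval(description: str,
                  action: Callable[[], str],
                  approve: Callable[[str], bool]) -> str:
    """Run an irreversible action only if the human checkpoint approves it."""
    if approve(description):
        return action()
    return "skipped: not approved"

# Example: gate an email send behind an injected approval callback
# (a stand-in for a real human prompt like input('Approve? y/n ')).
result = with_approval(
    "Send follow-up email to lead@example.com",
    action=lambda: "email sent",
    approve=lambda desc: desc.startswith("Send"),
)
```

Injecting the approval function keeps the agent logic testable while leaving the actual yes/no decision with a person.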