TechEarl

How to Write an Effective System Prompt

Write an effective system prompt for an LLM with five parts: role, capabilities, constraints, output format, refusal policy. With before/after examples and the structure that maximises cache hits.

Ishan Karunaratne · 8 min read

A production system prompt has five parts in this order: role (who the model is), capabilities (what it can do), constraints (what it must not do), output format (the shape of the response), and refusal policy (what to say when the user asks for something outside the role). Each part has a job. Skipping any of them leaves the model to guess, and the guess is rarely what you want. I'll walk through all five with before/after examples, then cover the structure that maximises prompt-cache hits.

The cliché advice "be specific" isn't actionable. Specific about what? The five-part structure is what to be specific about. It works across Claude, GPT, Gemini, and local models because it covers the same gaps every model has: it doesn't know who it is, what it's allowed to do, what it must avoid, what shape the answer takes, or what to say when asked to go off-script.

Part 1: Role

The role is one or two sentences telling the model who it is. Not "you are a helpful assistant" — that's the default and produces generic output. Something specific.

Bad: You are a helpful assistant.

Good: You are a senior backend engineer specialising in MySQL optimisation. You have 15 years of production experience and tend toward pragmatic over theoretical solutions.

The specifics shape every subsequent generation. A "senior backend engineer" writes different code than a "junior developer learning Rails". A "compliance officer" answers questions about user data differently than a "marketing copywriter". Pick the role that matches the actual job.

Part 2: Capabilities

List the things the model is supposed to be able to do. This sounds redundant, since the model already knows what it can do, but it's where you tell the model which tasks to apply those abilities to.

code
Capabilities:
- Review SQL queries for performance issues
- Suggest index changes
- Explain EXPLAIN output line by line
- Compare MySQL versions (5.7, 8.0, 8.4) when version-specific behaviour matters

Capabilities act as soft routing. When a user asks "should this be an INDEX or a UNIQUE INDEX?", the model knows it's allowed to give a strong opinion because "suggest index changes" is in the capability list. When they ask "rewrite my Python code", the model is more likely to redirect because Python isn't in the capability list.

Part 3: Constraints

The opposite of capabilities — what the model must not do. Constraints are where you encode the rules that matter for your product:

code
Constraints:
- Never generate SQL that drops a production table
- Never execute commands. Only suggest them, then let the user run them.
- Never assume the user is on a specific MySQL version unless they say so. Ask.
- Never write SQL longer than 50 lines without proposing a refactor first.

Constraints prevent the model from being helpful in ways that are dangerous. They also prevent specific kinds of unhelpfulness. The 50-line rule is one example: without it, you sometimes get walls of SQL that the user can't review.

Write constraints as imperatives ("Never X"), not preferences ("Avoid X"). In practice, the imperative form tends to be followed more reliably.
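If the constraint list lives in code, the "imperatives, not preferences" rule can be checked mechanically. This is a minimal sketch, not a standard tool: `findSoftConstraints` and the `SOFT_OPENERS` list are names and choices I made up for illustration, so adjust the openers to whatever soft phrasing creeps into your own prompts.

```javascript
// Hypothetical helper: flags constraint lines phrased as preferences
// rather than imperatives. The list of soft openers is an assumption.
const SOFT_OPENERS = ["avoid", "try to", "prefer", "ideally", "where possible"];

function findSoftConstraints(constraints) {
  return constraints.filter((line) => {
    // Strip the leading bullet marker and whitespace before comparing.
    const text = line.replace(/^[-*\s]+/, "").toLowerCase();
    return SOFT_OPENERS.some((opener) => text.startsWith(opener));
  });
}

const constraints = [
  "- Never execute commands. Only suggest them, then let the user run them.",
  "- Avoid writing SQL longer than 50 lines.",
  "- Try to ask for the MySQL version.",
];

// Collects the two lines that should be rewritten as "Never ..." rules.
const soft = findSoftConstraints(constraints);
console.log(soft);
```

Running a check like this in CI keeps soft phrasing from slipping back in when several people edit the same prompt.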

Part 4: Output format

The shape of the response. This is the part most prompts skip, and it's the part that has the biggest impact on whether the model output is usable in your app.

For structured-output use cases, this overlaps with schema-constrained JSON (covered in How to Get Reliable JSON from an LLM). For free-text use cases:

code
Output format:
- Start with a one-line summary in italics.
- Show the recommended query in a SQL code block.
- Add 2-4 bullet points explaining why this is better than the original.
- If you suggest an index, show the CREATE INDEX statement separately.
- Use Markdown, not HTML.

The result: every response has the same shape, which makes the UI rendering predictable and the user's mental model consistent.
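A fixed output format also makes responses checkable before they reach the UI. The sketch below tests a response against the example format above (italic summary, SQL block, 2-4 bullets, no HTML). `checkResponseShape` is a hypothetical helper and its regexes are deliberately rough, so treat it as a starting point rather than a Markdown parser.

```javascript
// Hypothetical checker for the example output format above.
// Built at runtime so this snippet contains no literal code fence.
const FENCE = "`".repeat(3);

function checkResponseShape(markdown) {
  const lines = markdown.trim().split("\n");
  const problems = [];

  // First line: a one-line summary wrapped in single * or _ (italics, not bold).
  if (!/^(\*|_)(?!\1).+\1$/.test(lines[0].trim())) {
    problems.push("first line is not an italic summary");
  }

  // At least one fenced block tagged as SQL.
  if (!lines.some((line) => line.trim().toLowerCase().startsWith(FENCE + "sql"))) {
    problems.push("no SQL code block");
  }

  // Count bullets and look for HTML tags, ignoring anything inside code fences.
  let inFence = false;
  let bullets = 0;
  let hasHtml = false;
  for (const line of lines) {
    if (line.trim().startsWith(FENCE)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;
    if (/^\s*[-*] /.test(line)) bullets += 1;
    if (/<\/?[a-z][^>]*>/i.test(line)) hasHtml = true;
  }
  if (bullets < 2 || bullets > 4) problems.push(`expected 2-4 bullets, found ${bullets}`);
  if (hasHtml) problems.push("contains HTML");

  return problems;
}

const goodResponse = [
  "*Add a composite index so the filter and the sort use one index.*",
  "",
  FENCE + "sql",
  "SELECT id, total FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 20;",
  FENCE,
  "",
  "- Filters and sorts from a single index",
  "- Avoids a filesort on large tables",
].join("\n");

const badResponse = "Here is the query:\n<b>SELECT 1</b>";

console.log(checkResponseShape(goodResponse));
console.log(checkResponseShape(badResponse));
```

When the check fails, one option is to retry the call or log the response for review instead of rendering it.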

Part 5: Refusal policy

The refusal policy covers what happens when the user asks for something outside the role's scope. Without one, the model typically:

  • Tries to help anyway and produces low-quality output outside its area
  • Refuses awkwardly with a generic "I can't help with that"
  • Goes off-topic for the rest of the conversation

A good refusal policy gives the model a specific way to redirect:

code
Refusal policy:
- If the user asks about a topic outside MySQL optimisation, briefly acknowledge it, then ask whether they want to refocus on the SQL question or end the session.
- If the user asks you to run a destructive command, decline and explain the risk in one sentence.
- If the user asks for legal or financial advice, redirect them to a qualified professional.

Concrete redirects beat generic refusals. The model now has a template for how to handle the case rather than improvising.

Full example: before and after

Before (typical first-draft system prompt):

code
You are a helpful AI assistant that helps users with their SQL queries.
Be polite and explain your reasoning.

After (five-part structured):

code
ROLE
You are a senior MySQL DBA with 15 years of production experience. You tend toward
pragmatic, indexable solutions over clever ones.

CAPABILITIES
- Review SQL queries for performance issues
- Suggest schema and index changes
- Explain EXPLAIN output line by line
- Identify when a query needs to become two queries
- Call out version-specific behaviour for MySQL 5.7, 8.0, and 8.4

CONSTRAINTS
- Never run commands; suggest them and let the user execute
- Never generate destructive SQL (DROP, TRUNCATE, or DELETE without WHERE) without an
  explicit confirmation step
- Never assume the MySQL version; ask if it's not in the conversation
- Keep individual SQL outputs under 50 lines; refactor or break into stages if longer

OUTPUT FORMAT
Start with a one-sentence summary. Then the recommended query in a SQL code block.
Then 2-4 bullets explaining why. If an index would help, show the CREATE INDEX in
a separate code block.

REFUSAL POLICY
- If the user asks about non-MySQL topics, acknowledge briefly and offer to refocus
- If the user asks to run a destructive operation, decline and explain the risk
- If the user is panicking about a production issue, give the safest fix first and
  the optimal fix second

The "after" version is longer, but every part has a job, and the model's output is far more predictable. It's also cacheable: the entire prompt is stable across calls, so with prompt caching the repeated prefix is billed at a discounted cached-input rate after the first invocation, provided the prompt is long enough to meet your provider's minimum cacheable length (covered in How to Cut LLM API Costs with Prompt Caching).
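If you keep the five parts as data rather than one long string, a small builder can enforce both the order and the "no skipped parts" rule. This is a sketch under my own assumptions: `buildSystemPrompt`, the key names, and the uppercase headings are illustrative choices that mirror the example above, not a convention any provider requires.

```javascript
// Hypothetical builder: assembles the five parts in a fixed order and
// refuses to produce a prompt if any part is missing or empty.
const SECTION_ORDER = ["role", "capabilities", "constraints", "outputFormat", "refusalPolicy"];

const HEADINGS = {
  role: "ROLE",
  capabilities: "CAPABILITIES",
  constraints: "CONSTRAINTS",
  outputFormat: "OUTPUT FORMAT",
  refusalPolicy: "REFUSAL POLICY",
};

function buildSystemPrompt(parts) {
  return SECTION_ORDER.map((key) => {
    const value = parts[key];
    if (value === undefined || value.length === 0) {
      throw new Error(`Missing system prompt section: ${key}`);
    }
    // Arrays become bullet lists; strings are used as-is.
    const body = Array.isArray(value) ? value.map((item) => `- ${item}`).join("\n") : value;
    return `${HEADINGS[key]}\n${body}`;
  }).join("\n\n");
}

const prompt = buildSystemPrompt({
  role: "You are a senior MySQL DBA with 15 years of production experience.",
  capabilities: ["Review SQL queries for performance issues", "Suggest schema and index changes"],
  constraints: ["Never run commands; suggest them and let the user execute"],
  outputFormat: "Start with a one-sentence summary. Then the recommended query in a SQL code block.",
  refusalPolicy: ["If the user asks about non-MySQL topics, acknowledge briefly and offer to refocus"],
});

console.log(prompt);
```

Because the output depends only on the input object, the same parts always produce the same string, which is what the caching advice in the next section relies on.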

Structuring for prompt-cache hits

Put the system prompt first, in full. Put dynamic content (user message, retrieved RAG context, current timestamp) outside the system prompt, in the user message. This way the system prompt is a stable prefix that prompt caching can recognise.

javascript
const request = {
  system: SYSTEM_PROMPT, // the whole 200-line stable block, identical on every call, so cacheable
  messages: [
    { role: "user", content: `User question: ${userMessage}\n\nRelevant context: ${ragContext}` }
  ]
};

Not:

javascript
const request = {
  system: `${SYSTEM_PROMPT}\n\nCurrent time: ${new Date()}`, // dynamic value in the prefix: kills the cache
  messages: [{ role: "user", content: userMessage }]
};

The current time, request ID, user ID, anything that varies per call has to live in the user message, not the system prompt. Otherwise no two calls share a prefix and you pay full price every time.
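One detail to check per provider: some cache matching prefixes automatically, while others expect an explicit marker on the block you want cached. The sketch below assumes the second case and follows the shape of Anthropic's Messages API, where the system prompt is sent as a text block with a `cache_control` marker. `buildRequest`, the placeholder model id, and the shortened `SYSTEM_PROMPT` are mine; confirm the exact fields against your provider's current documentation.

```javascript
// Assumed placeholder; in practice this is the full five-part prompt.
const SYSTEM_PROMPT = "ROLE\nYou are a senior MySQL DBA with 15 years of production experience.";

// Hypothetical request builder. Everything that varies per call goes into
// the user message, so the system block stays byte-identical across calls.
function buildRequest(userMessage, ragContext, now = new Date()) {
  return {
    model: "your-model-id", // placeholder, not a real model name
    max_tokens: 1024,
    system: [
      // Explicit cache marker, following Anthropic's documented block shape.
      { type: "text", text: SYSTEM_PROMPT, cache_control: { type: "ephemeral" } },
    ],
    messages: [
      {
        role: "user",
        content:
          `Current time: ${now.toISOString()}\n\n` +
          `User question: ${userMessage}\n\n` +
          `Relevant context: ${ragContext}`,
      },
    ],
  };
}

const first = buildRequest("Why is this query slow?", "orders has 40M rows", new Date(0));
const second = buildRequest("Which index should I add?", "no extra context", new Date(86400000));

// The cacheable prefix is the same in both requests; only the user message differs.
const samePrefix = JSON.stringify(first.system) === JSON.stringify(second.system);
console.log(samePrefix);
```

A comparison like `samePrefix` is cheap to keep as a unit test: if someone later interpolates a timestamp or user ID into the system block, it fails before the cache hit rate drops.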

What to do next

External reference: the Anthropic prompt engineering guide covers the patterns Anthropic recommends. OpenAI's prompt engineering documentation covers GPT-specific patterns.
