Insight · March 26, 2026 · 19 min read

30 Claude Tips

30 Claude Tips That Actually Change How You Prompt

A practical reference for getting better, faster, cheaper results from Claude. Every tip here comes from Anthropic's official documentation, their engineering blog, or battle-tested production workflows. No theory without application. Each tip includes a concrete example you can use immediately.

1. Prefill the assistant turn

Start Claude's reply for it. This forces a specific format, kills the preamble, and steers the output from the very first token.

In the API, you add a partial assistant message after your user message. Claude continues exactly where you left off.

Without prefill:

User: Classify this support ticket as billing, technical, or account. "I can't log into my dashboard since yesterday."

Claude responds: "Sure! I'd be happy to help classify this ticket. Based on the content..." (preamble before the answer)

With prefill (add a partial assistant message):

Assistant (prefilled): {"category":

Claude continues: "technical", "confidence": 0.95, "reason": "Login access issue"}

The prefill steers Claude straight into the JSON structure. No preamble. No explanation. Just the output you need.

2. Put the most important instruction last

Claude pays more attention to what appears at the end of a long prompt. This is called recency bias. If you have a critical formatting rule or constraint, place it after your reference material, not before.

Weaker (instruction buried at the top):

You must respond in JSON only. No markdown.

\[5,000-word product requirements document\]

What are the three highest-priority features?

Stronger (instruction at the end):

\[5,000-word product requirements document\]

What are the three highest-priority features? Critical: Respond in JSON only. No markdown. No preamble.

Anthropic's own documentation confirms that queries placed at the end of long-context prompts improve response quality by up to 30%, especially with complex, multi-document inputs.

3. Use XML tags to separate content from instructions

XML tags create semantic boundaries that Claude interprets structurally. They prevent the model from confusing your reference data with your directions, and they make complex prompts dramatically more reliable.

Without tags (ambiguous):

Summarize the following. The quarterly revenue was $4.2M, up 23% YoY. Customer churn was 3.2%. Focus on financial risks only.

Claude might not know where the data ends and the instruction begins.

With tags (clear):

<instructions>
Summarize the document below. Focus exclusively on financial risks.
Output 2-3 sentences.
</instructions>

<document>
The quarterly revenue was $4.2M, up 23% YoY. Customer churn was 3.2%...
</document>

Claude now knows exactly what is data and what is direction. This becomes essential when you pass in multiple documents, user inputs, or few-shot examples.

4. "Think step by step" in the system prompt, not the user turn

Placing reasoning instructions in the system prompt ensures consistent reasoning across every API call, not just when you remember to ask for it.

Inconsistent (in the user message):

User: Analyze this error log and find the root cause. Think step by step.

This only works when you remember to add it.

Consistent (in the system prompt):

System: You are a senior backend engineer. When diagnosing issues, always reason step by step before giving your conclusion. Show your reasoning process.

Now every message to this system benefits from structured reasoning without repeating the instruction.

Note: if you have access to Claude's extended thinking feature, that is generally preferable to manual chain-of-thought prompting. But for the free plan or situations where you need visible, reviewable reasoning, the system prompt approach still works.

5. Multishot beats description for format

Showing Claude an example input-output pair is more reliable than describing the format you want. One well-crafted example anchors style, structure, and tone better than a paragraph of instructions.

Description only (fragile):

Return a JSON object with fields: name (string), severity (1-5 integer), and tags (array of strings).

With example (robust):

Classify the following bug report. Here is an example of the expected format:

Input: "The app crashes when I click the export button on Safari." Output: {"name": "Safari export crash", "severity": 3, "tags": ["browser-specific", "export", "crash"]}

Now classify this: Input: "Dashboard charts don't load on mobile after the latest update."

Start with one example. Only add more if the output still drifts.

6. Give Claude permission to say "I don't know"

Explicitly telling Claude that uncertainty is acceptable drastically reduces hallucinations. Without this, it fills gaps with confident-sounding fabrications.

Without permission:

What was the exact revenue figure for Acme Corp in Q3 2025?

Claude might generate a plausible-sounding number even if it has no data.

With permission (add to your system prompt):

If you are unsure about a fact or do not have enough information to answer accurately, say so explicitly. Do not guess or fabricate data. It is always better to say "I don't have enough information to answer this" than to provide an inaccurate response.

Anthropic's own documentation lists this as the number one technique for reducing hallucinations.

7. Tell Claude what to do, not what not to do

"Respond in flowing prose paragraphs" works better than "Don't use markdown." Claude 4.x follows positive instructions more reliably than negations.

Weaker (negation):

Do not use bullet points. Do not use headers. Do not use bold text.

Stronger (positive direction):

Write your response as smoothly flowing prose paragraphs. Use natural transitions between ideas. Keep formatting minimal and conversational.

You can also reinforce this with XML-style format indicators: "Write the prose sections of your response inside <flowing_prose> tags." Claude tends to match the formatting style of whatever structure you present. If your prompt is full of markdown, the output will be too.

8. Put long documents above your instructions

For prompts over 20K tokens, placing reference material first and your query at the end can improve response quality by up to 30%. This applies across all Claude models.

Weaker structure:

Summarize the key risks in this contract. Focus on liability clauses.

\[50-page contract\]

Stronger structure:

\[50-page contract\]

Based on the document above, summarize the key risks. Focus on liability clauses and termination conditions. Output 3-5 bullet points.

The model processes the document, builds up a representation, and then encounters your specific question while the full context is fresh. Anthropic calls this "put longform data at the top."

9. Explain why, not just what

Adding motivation behind an instruction helps Claude generalize better than bare commands. Claude is smart enough to extrapolate from the reasoning.

Without explanation:

Keep your response under 100 words.

With explanation:

Keep your response under 100 words because this will be displayed as a tooltip in our mobile app, and longer text gets truncated on small screens.

Now Claude understands the constraint. If it needs to decide what to cut, it prioritizes information that works in a mobile tooltip context. The "why" gives it a decision-making framework, not just a rule.

10. Prompt caching drops costs up to 90%

If your system prompt or reference docs stay the same across API calls, cache them. Cached reads cost just 10% of normal input token pricing, and latency drops by up to 85%.

How it works: Mark the static portion of your prompt with a cache_control parameter. On the first call, Claude processes and caches the static part. On subsequent calls, it reads from cache.

Good candidates for caching:

A system prompt with detailed instructions. A 50-page product manual your chatbot references on every call. A set of 20 few-shot examples. Tool definitions for an agent.

Not cacheable:

Content that changes every call (the user's message). Prompts shorter than the minimum threshold (1,024 tokens for most models, 4,096 for Haiku).

The key rule: static content goes first (tools, then system prompt, then reference docs), dynamic content goes last.

11. Extended thinking for hard problems, not easy ones

Extended thinking dramatically improves complex reasoning (math, multi-step logic, architecture planning) but adds latency and cost with no benefit on simple queries.

Use extended thinking for: Complex multi-step math. Code debugging across multiple files. Architectural planning with tradeoffs. Logic puzzles with many constraints.

Skip extended thinking for: Simple classification. Straightforward text generation. FAQ-style question answering. Basic summarization.

Start with the minimum thinking budget (1,024 tokens) and increase incrementally. Diminishing returns are real, and the optimal budget depends on the task.

12. Chain prompts instead of stuffing one

Break complex tasks into sequential API calls. Each step is simpler and more accurate than one mega-prompt.

Single prompt (fragile):

Analyze this medical paper, summarize the methodology, evaluate the statistical rigor, identify clinical implications, and format everything as a peer review report.

Chained prompts (robust):

Call 1: "Summarize this medical paper covering methodology, findings, and clinical implications."
**Call 2: "Review the summary above for accuracy, clarity, and completeness. Provide graded feedback."
**Call 3: "Improve the summary based on this feedback: \[output from call 2\]"

Tradeoff: higher latency (multiple API calls), but accuracy and reliability improve dramatically.

13. Match your prompt's formatting to the output you want

Claude mirrors the formatting style of your prompt. If your prompt is full of markdown headers, the output will be too. Want clean prose? Write your prompt in clean prose.

Markdown-heavy prompt produces markdown-heavy output:

## Task Description
**Objective:** Analyze the data
**Requirements:**
- Point 1
- Point 2

Clean prose prompt produces clean prose output:

Analyze the attached dataset and write a narrative summary of the key trends. Use flowing paragraphs. This will be read aloud in a presentation, so make it sound natural when spoken.

This works both directions. If you want a highly structured output, structure your prompt to match.

14. One example is usually enough

Start with one-shot prompting. Only add more if the output still drifts.

One-shot (start here):

Extract action items from meeting notes. Example:

Input: "We agreed to ship the new onboarding flow by March. Sarah will handle the design. Mike needs to update the API docs."

Output:

Ship new onboarding flow (deadline: March)
Sarah: handle design
Mike: update API docs

Now extract action items from this: \[your meeting notes\]

If the format is wrong after one example, add a second that covers the specific edge case. But resist adding five examples upfront. Each one burns tokens and can introduce conflicting patterns.

15. Ask Claude to quote before it answers

For long-document tasks (20K+ tokens), telling Claude to extract relevant quotes first grounds its response in the actual text and significantly reduces hallucination.

Without quoting (higher hallucination risk):

Based on the contract above, what are the termination conditions?

With quoting (grounded):

First, extract the exact quotes from the contract that describe termination conditions. Include the section numbers. Then, based only on those quotes, summarize the termination conditions in plain language.

This forces Claude to locate specific evidence before synthesizing. If the document doesn't mention something, Claude can't fabricate a quote for it, which means it can't fabricate the answer.

16. Avoid the word "think" when extended thinking is off

Claude 4.x models are sensitive to "think" and its variants when extended thinking is not enabled. This can trigger unintended behavior.

Potentially problematic:

Think about the pros and cons of this architecture.

Safer alternatives:

Consider the pros and cons of this architecture.

Evaluate the tradeoffs of this architecture.

Assess the strengths and weaknesses of this approach.

Use "consider," "evaluate," "assess," or "weigh" instead.

17. System prompts set identity. User messages set tasks.

The system prompt is for persistent behavior: tone, constraints, role. Per-task instructions go in the user message. Mixing them causes drift.

System prompt (who Claude is across all calls):

You are a senior tax accountant. You give precise, conservative advice based on current US tax law. When uncertain, flag it as something the user should verify with a licensed CPA. You never speculate about audit risk. You respond in clear, jargon-free language.

User message (what Claude does right now):

My LLC received a 1099-NEC for $85,000 in consulting income. I also have $12,000 in home office expenses. Walk me through how to report this on my tax return.

The system prompt stays constant. The user message changes every time.

18. Wrap each document in its own XML tags with metadata

When passing multiple documents, use structured XML with source metadata so Claude can distinguish and cite them properly.

Without structure (ambiguous):

Here are some documents. \[dump everything together\]

With structure (precise):

<documents>
  <document index="1">
    <source>Q4_2025_financial_report.pdf</source>
    <document_content>
    [financial report text]
    </document_content>
  </document>
  <document index="2">
    <source>customer_survey_results.csv</source>
    <document_content>
    [survey data]
    </document_content>
  </document>
</documents>

Based on the documents above, which one mentions customer churn
and what is the reported rate?

Now Claude can reference "document 1" or "the financial report" unambiguously.

19. Claude 4.x is literal. Ask for what you actually want.

Older Claude models inferred intent and went above and beyond. Current models do exactly what you ask, nothing more.

Vague prompt, minimal output:

Create an analytics dashboard.

Claude might return a basic frame with a title and one chart.

Explicit prompt, full output:

Create an analytics dashboard. Include filters, date range selectors, multiple chart types, summary statistics, responsive layout, and export options. Go beyond the basics to create a fully-featured implementation.

Claude 4.x is designed for precise instruction following. "Build a dashboard" gets you a dashboard. "Build a comprehensive, production-quality dashboard with these specific features" gets you that instead.

20. Use structured output mode for reliable JSON

If you need valid JSON, use the API's structured output or tool-use mode rather than hoping the model formats correctly in free text.

Fragile (free-text JSON):

Return your analysis as a JSON object with fields: risk_level, summary, and recommendations.

Claude might add markdown code fences, include a preamble, or produce invalid JSON.

Robust (tool use / structured output):

Define a tool with the exact JSON schema you need. Claude generates the structured output as a tool call, guaranteed to match the schema.

For simpler cases, combining a JSON instruction with assistant prefill (starting the response with {) is a good middle-ground.

21. Run the same prompt multiple times to catch hallucinations

Best-of-N verification: if Claude gives different answers across multiple runs (at temperature > 0), at least one is likely wrong. Consistency signals reliability.

How to use this:

Run your prompt 3-5 times. If the core factual claims match across all runs, confidence is high. If a specific claim appears in only 1 or 2 runs, treat it as suspect.

This is especially useful for factual extraction from documents, code generation (does the same logic appear each time?), and classification tasks (is the label stable?).

For production systems, automate it: run N calls, compare outputs, flag inconsistencies for human review.

22. Constrain the output length explicitly

"Give me a summary" produces 500 words. "Summarize in exactly 3 sentences, each under 20 words" produces exactly that.

Vague:

Summarize this article.

Specific:

Summarize this article in exactly 3 sentences. Each sentence should be under 25 words. The first sentence covers the main finding. The second covers the methodology. The third covers the limitation.

You can constrain by sentence count, word count, paragraph count, or character count. The more specific the constraint, the more closely Claude follows it.

23. Do not over-engineer the prompt

Longer is not always better. Start simple. Test. Add complexity only when you see a specific failure.

Over-engineered:

You are an expert technical writer with 20 years of experience in API documentation. You have deep expertise in REST APIs, GraphQL, and gRPC. You value clarity above all else. Always use the active voice. Never use jargon without defining it. Structure your response with headers. Use code examples for every endpoint. Follow the Microsoft Style Guide...

\[30 more lines of instructions\]

Write documentation for this endpoint: GET /users/:id

Right-sized:

Write API documentation for this endpoint: GET /users/:id

Include: description, parameters, example request, example response, and error codes. Use clear, active-voice language. Assume the reader is a junior developer.

The second version produces output that is just as good, uses far fewer tokens, costs less, and is easier to maintain.

24. Prompt caching has a minimum token threshold

You need at least 1,024 tokens of cacheable content for caching to work on most Claude models (2,048 for Sonnet, 4,096 for Haiku). Small system prompts don't qualify.

Will not cache (too small):

System prompt: "You are a helpful assistant." (7 tokens)

Will cache (large enough):

System prompt with detailed instructions, persona definition, formatting rules, few-shot examples, and reference material totaling 2,000+ tokens.

If your system prompt is short, prepend reference documents or few-shot examples to push it over the threshold.

25. Use the system prompt to reduce tool over-triggering

Claude 4.x is more responsive to system prompt instructions than previous versions. If Claude calls tools too aggressively, soften the language.

Over-triggering (too aggressive):

CRITICAL: You MUST use the search tool for EVERY factual question. ALWAYS search before responding.

Balanced:

Use the search tool when you need current information or are unsure about a fact. For well-established concepts you are confident about, respond directly.

Claude 4.x takes instructions literally. "ALWAYS use this tool" means always. Scale the urgency to match how often you actually want it used.

26. For agentic tasks, have Claude write tests first

Before starting a multi-step coding task, ask Claude to create a test suite. Tests give the model a concrete target and make iteration reliable.

Without tests (drift-prone):

Build a REST API for user management with CRUD operations, authentication, and rate limiting.

With tests first (anchored):

Before writing any code, create a comprehensive test suite for a REST API with these requirements: CRUD operations for users, JWT authentication, rate limiting at 100 requests per minute. Write tests in pytest. Then implement the API to pass all tests.

The tests become the definition of "done." Across context windows, Claude can always run tests to verify it hasn't regressed. Anthropic recommends keeping tests in a structured format like tests.json for long-running tasks.

27. Git is state management for long coding sessions

For coding tasks spanning multiple context windows, Claude can use git logs and diffs to reconstruct state. Git provides better tracking than compacted conversation summaries.

Add to your system prompt:

After completing each meaningful unit of work, commit your changes with a descriptive commit message. When starting a new context window, review the git log and recent diffs to understand the current state before continuing.

Claude 4.x models are exceptionally good at discovering state from the filesystem. In some cases, a fresh context window where Claude reads git history produces better results than compacted conversation summaries.

28. Tell Claude to clean up temporary files

Claude often creates scratch scripts to test ideas during coding tasks. This improves quality but clutters your workspace.

Add to your system prompt:

If you create any temporary files, scripts, or helper files for testing and iteration, clean them up by removing them when the task is complete.

This lets Claude use its scratchpad approach (which Anthropic says improves outcomes) without leaving artifacts behind.

29. Parallel tool calls are free performance

Claude 4.x can fire multiple independent tool calls simultaneously. Reading 5 files? It reads all 5 in one step instead of 5 sequential steps.

Add to your system prompt for maximum parallelism:

If you intend to call multiple tools and there are no dependencies between the calls, make all independent tool calls in parallel. For example, when reading 3 files, read all 3 simultaneously. Never use placeholders or guess missing parameters. If calls depend on each other, execute them sequentially.

The inverse works too. If parallel calls cause bottlenecks: "Execute operations sequentially with brief pauses between each step."

30. Self-check blocks catch mistakes before you do

Append a verification checklist at the end of your prompt. Claude reviews its own output and often catches errors.

Add this after your main instructions:

Before finalizing your response, verify:

Did you follow the requested output format exactly?
Are any claims uncertain? If yes, mark them with \[UNCERTAIN\].
Are all steps actionable and specific (not vague)?
Did you stay within the stated constraints (word count, scope, format)?
Did you address every part of the original question?

This isn't perfect, but it catches a meaningful percentage of formatting errors, missed requirements, and unsupported claims. For production systems, pair this with automated evals for the highest reliability.

Bonus: The meta-tip

The single best way to test whether your prompt is good enough: show it to a colleague who knows nothing about the task and ask them to follow it. If they would be confused, Claude will be too. Prompts are not magic incantations. They are instructions. Write them like a clear email to a new teammate on their first day.

Quick reference table

#TipWhen to use it1Prefill the assistant turnYou need strict format control2Important instructions lastLong prompts with critical constraints3XML tags for structureMultiple data sources or complex prompts4Step-by-step in system promptAny system needing consistent reasoning5Multishot over descriptionOutput format matters6Permission to say "I don't know"Factual or data-driven tasks7Positive instructionsAny formatting or style direction8Documents above, query belowLong-context prompts (20K+ tokens)9Explain the whyComplex constraints or judgment calls10Prompt cachingRepeated API calls with static content11Extended thinking selectivelyComplex reasoning only12Chain promptsMulti-step analysis or generation13Match prompt style to outputAny task with specific tone or format14One example firstFormat-sensitive tasks15Quote before answeringLong-document analysis16Avoid "think" without thinking modeAll prompts without extended thinking17System vs. user separationEvery API implementation18XML-wrapped documentsMulti-document prompts19Be explicit with Claude 4.xEvery interaction20Structured output for JSONAny JSON-output task21Best-of-N verificationHigh-stakes factual tasks22Explicit length constraintsSummaries, snippets, constrained output23Don't over-engineerEvery prompt (start simple)24Caching token minimumsBefore implementing prompt caching25Tool-triggering languageAgent or tool-use systems26Tests first for agentsMulti-step coding tasks27Git for state trackingLong coding sessions28Clean up temp filesAgentic coding tasks29Parallel tool callsMulti-tool agent workflows30Self-check blocksProduction prompts

Sources: Anthropic API Documentation (2026), Anthropic Engineering Blog on Context Engineering, Claude Prompting Best Practices, Claude 4.x Best Practices Guide, Anthropic Prompt Caching Documentation.

Filed under: ai, best practices

Read this article as markdown