The prompt engineering techniques that worked wonders in 2023 can actually hurt your results with modern LLMs. As models have grown more capable, they've also become more opinionated about how they receive instructions. The "pretend you're an expert" and "take a deep breath" hacks that once boosted performance now often trigger verbose, over-qualified responses.
After working with Claude 4, GPT-5, and Gemini 2 extensively, I've identified the prompting patterns that consistently produce better results. These aren't theoretical—they're battle-tested techniques I use daily when building AI-powered features.
How Chain-of-Thought Prompting Has Evolved
Chain-of-thought (CoT) prompting revolutionized LLM performance when it was introduced. The basic idea—asking the model to "think step by step"—remains valid, but the implementation has evolved significantly.
The Problem with Naive CoT
Modern models are already trained to reason internally. Explicitly asking them to "think step by step" often produces unnecessarily verbose outputs:
❌ Old Pattern (2023):
"Think step by step. First, consider... Then, analyze..."
Result: Model produces lengthy reasoning that doesn't add value.
Modern CoT Approaches
Instead of explicit chain-of-thought instructions, structure your prompts to naturally elicit reasoning:
✅ Modern Pattern:
"Analyze the following code for security vulnerabilities.
For each vulnerability found:
1. Identify the specific line(s) affected
2. Explain the attack vector
3. Provide a fix with code"
This achieves the same reasoning depth without the artificial prefix.
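To make this pattern repeatable, the prompt can be generated from a task description and a list of per-item steps. The sketch below is only an illustration: `buildStructuredPrompt` and its parameter names are hypothetical, not part of any SDK, and it only builds the string (sending it to a model is left out).

```typescript
// Hypothetical helper: builds a task prompt whose numbered steps imply the
// reasoning order, instead of prefixing "think step by step".
function buildStructuredPrompt(
  task: string,
  itemLabel: string,
  steps: string[],
): string {
  const numbered = steps.map((step, i) => `${i + 1}. ${step}`).join('\n')
  return `${task}\nFor each ${itemLabel} found:\n${numbered}`
}

const securityPrompt = buildStructuredPrompt(
  'Analyze the following code for security vulnerabilities.',
  'vulnerability',
  [
    'Identify the specific line(s) affected',
    'Explain the attack vector',
    'Provide a fix with code',
  ],
)

console.log(securityPrompt)
```

Keeping the steps in an array makes it easy to add or reorder them per task without rewriting the whole prompt.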
Structured Output Prompting: JSON and XML
Getting reliable, parseable output from LLMs is crucial for programmatic use. Modern APIs offer native structured output support:
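import { z } from 'zod'
import { zodToJsonSchema } from 'zod-to-json-schema'

// Shape the model's reply must match.
const AnalysisSchema = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  confidence: z.number().min(0).max(1),
  keyTopics: z.array(z.string()),
  summary: z.string()
})

// `client` is an initialized SDK client and `review` is the text to analyze.
// The name and shape of the structured-output parameter vary by provider and
// SDK version, so check your SDK's reference before copying `response_format`.
const response = await client.messages.create({
  model: 'claude-4-sonnet-20260215',
  max_tokens: 1024,
  messages: [{ role: 'user', content: `Analyze: "${review}"` }],
  response_format: {
    type: 'json_schema',
    json_schema: zodToJsonSchema(AnalysisSchema)
  }
})
For complex, nested outputs, XML often works better than JSON in prompts. Claude in particular handles XML structure exceptionally well.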
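As a rough illustration of the XML approach, here is a minimal sketch that wraps input in a tag and pulls tagged fields back out of a reply. The helper names (`wrapInTag`, `extractTag`), the tag names, and the sample reply are all made up for this example. The extraction is regex-based and assumes flat, non-nested tags with plain alphanumeric names, so it is not a real XML parser.

```typescript
// Hypothetical helpers for XML-style prompting; tag names are illustrative.

// Wrap a piece of input in a named tag so the model can tell it apart
// from the surrounding instructions.
function wrapInTag(tag: string, content: string): string {
  return `<${tag}>\n${content}\n</${tag}>`
}

// Return the trimmed text of the first <tag>...</tag> pair in a reply,
// or null when the tag is absent. Assumes flat tags with simple names.
function extractTag(reply: string, tag: string): string | null {
  const pattern = new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`)
  const match = pattern.exec(reply)
  return match?.[1]?.trim() ?? null
}

const reviewPrompt = [
  'Summarize the review inside <review> and rate its sentiment.',
  wrapInTag('review', 'Battery life is great, but the screen scratches easily.'),
  'Reply with <sentiment> and <summary> tags only.',
].join('\n\n')

// A made-up reply standing in for real model output.
const sampleReply =
  '<sentiment>neutral</sentiment>\n<summary>Good battery, fragile screen.</summary>'

console.log(reviewPrompt)
console.log(extractTag(sampleReply, 'sentiment')) // neutral
console.log(extractTag(sampleReply, 'summary')) // Good battery, fragile screen.
```

Returning `null` for a missing tag lets the caller decide whether to retry or fail, rather than silently passing an empty string downstream.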
Using Negative Constraints for Better Results
One of the most underutilized prompting techniques is negative constraints—telling the model what NOT to do:
✅ Good Negative Constraints:
- "Do not include explanatory preamble"
- "Do not apologize or express uncertainty"
- "Do not suggest alternatives unless asked"
- "Do not wrap code in markdown unless specified"
- "Output only the code, nothing else"
This produces clean, focused output without the model's natural tendency to over-explain.
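In practice, it helps to keep constraints as data and append them to a base system prompt. The sketch below assumes that setup; `withNegativeConstraints` and `stripCodeFence` are hypothetical names. The second helper is a defensive fallback I am assuming you might want, since a model can still wrap code in a fence despite being told not to.

```typescript
// Hypothetical helper: appends "do not" rules to a base system prompt.
function withNegativeConstraints(base: string, constraints: string[]): string {
  if (constraints.length === 0) return base
  const rules = constraints.map((rule) => `- ${rule}`).join('\n')
  return `${base}\n\nConstraints:\n${rules}`
}

// Three backticks, built indirectly so this listing contains no literal fence.
const FENCE = '`'.repeat(3)

// Defensive fallback: if the whole reply is wrapped in one markdown code
// fence, return only the fenced body; otherwise return the trimmed reply.
function stripCodeFence(reply: string): string {
  const trimmed = reply.trim()
  const pattern = new RegExp(`^${FENCE}[^\\n]*\\n([\\s\\S]*?)\\n?${FENCE}$`)
  const match = pattern.exec(trimmed)
  return match?.[1] ?? trimmed
}

const systemPrompt = withNegativeConstraints(
  'You generate TypeScript utility functions.',
  [
    'Do not include explanatory preamble',
    'Do not wrap code in markdown unless specified',
  ],
)

console.log(systemPrompt)
```

Checking the output after the fact is cheap insurance: the constraint reduces how often the fence appears, and the fallback handles the cases where it still does.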
Few-Shot vs Zero-Shot: When to Use Each
Modern models handle most common tasks effectively without examples:
- Zero-shot works for: Standard formatting, well-defined transformations, common programming patterns
- Few-shot is essential for: Custom formats, domain-specific conventions, tone matching, edge cases
The Quality Over Quantity Rule: If using few-shot, two or three high-quality, diverse examples outperform many similar ones.
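One common way to supply few-shot examples is as alternating user and assistant turns placed before the real input. The sketch below assumes that convention. The `ChatMessage` type is defined locally rather than imported from an SDK, and the commit-message examples are invented for illustration.

```typescript
// Message shape used by many chat-style APIs; defined locally so the sketch
// has no SDK dependency.
type ChatMessage = { role: 'user' | 'assistant'; content: string }
type Example = { input: string; output: string }

// Turn each example into a user/assistant pair, then append the real input
// as the final user message.
function buildFewShotMessages(examples: Example[], input: string): ChatMessage[] {
  const messages: ChatMessage[] = []
  for (const example of examples) {
    messages.push({ role: 'user', content: example.input })
    messages.push({ role: 'assistant', content: example.output })
  }
  messages.push({ role: 'user', content: input })
  return messages
}

// Made-up examples for a custom commit-message format. They are deliberately
// different from each other (a feature and a bug fix) rather than near-duplicates.
const commitExamples: Example[] = [
  { input: 'Added retry logic to the HTTP client', output: 'feat(http): add retry logic' },
  { input: 'Fixed crash when config file is empty', output: 'fix(config): handle empty file' },
]

const fewShotMessages = buildFewShotMessages(commitExamples, 'Updated README install steps')
console.log(fewShotMessages.length) // 5
```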
Building Reusable Prompt Templates
import { z } from 'zod'

interface PromptTemplate {
  system: string
  userTemplate: string
  variables: string[] // placeholder names expected in userTemplate
  outputFormat?: z.ZodSchema
}

const codeReviewTemplate: PromptTemplate = {
  system: `You are a senior software engineer conducting code reviews.
Focus on: security, performance, maintainability.
Be direct and actionable.`,
  userTemplate: `Review this {{language}} code:
\`\`\`{{language}}
{{code}}
\`\`\`
Focus areas: {{focusAreas}}`,
  variables: ['language', 'code', 'focusAreas']
}
Treat prompts like code: version control them, test them, and iterate based on real outputs.
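A template like this still needs something to fill in the `{{variable}}` placeholders. Here is a minimal sketch of one way to do that. `renderTemplate` is a hypothetical helper, and throwing on missing variables is my assumption about the desired behavior: it keeps a half-filled prompt from ever reaching the model.

```typescript
// Hypothetical renderer for {{variable}} placeholders.
function renderTemplate(
  template: string,
  variables: string[],
  values: Record<string, string>,
): string {
  const has = (key: string): boolean =>
    Object.prototype.hasOwnProperty.call(values, key)

  // Fail fast if any declared variable has no value.
  const missing = variables.filter((key) => !has(key))
  if (missing.length > 0) {
    throw new Error(`Missing template variables: ${missing.join(', ')}`)
  }

  // Placeholders without a value are left untouched so they are easy to spot.
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder: string, key: string) =>
    has(key) ? String(values[key]) : placeholder,
  )
}

const rendered = renderTemplate(
  'Review this {{language}} code:\n{{code}}\nFocus areas: {{focusAreas}}',
  ['language', 'code', 'focusAreas'],
  { language: 'typescript', code: 'const x = 1', focusAreas: 'security' },
)

console.log(rendered)
```

Because the renderer is a pure function, it is easy to cover with unit tests alongside the templates themselves.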
Conclusion: Principles Over Patterns
The specific patterns will evolve as models improve. What won't change are the underlying principles: structure implies reasoning, explicit beats implicit, use native features, and iterate on real outputs. The best prompt engineers systematically test, measure, and refine their prompts based on actual results.



