The Anthropic Claude API is one of the cleanest, most developer-friendly AI APIs available today. Unlike some competitors, Anthropic invested heavily in predictable behaviour, honest limitations, and instruction-following accuracy — qualities that matter enormously in production applications.

This guide is for developers who already know how to code and want to build real applications with Claude — not toy demos. We'll cover the Messages API, streaming, tool use, system prompts, multi-turn conversations, and production patterns in TypeScript.

Prerequisites: Node.js 18+, basic TypeScript, an Anthropic account with API key. All examples use @anthropic-ai/sdk v0.40+. Set ANTHROPIC_API_KEY in your environment.

1. Choosing the Right Model

Anthropic offers several Claude models. Here's a practical guide to which one to use:

  • claude-opus-4-6 (most capable): complex reasoning, architecture, senior-level code review. 200K context, slower, $15/1M input tokens.
  • claude-sonnet-4-6 (recommended): production apps, coding assistants, daily tasks. 200K context, fast, $3/1M input tokens.
  • claude-haiku-4-5: high-volume, low-latency tasks, classification. 200K context, very fast, $0.25/1M input tokens.

For most production apps: Start with claude-sonnet-4-6. It hits the best balance of intelligence, speed, and cost. Upgrade to Opus only for high-stakes reasoning; downgrade to Haiku for classification, tagging, or other simple high-volume tasks.
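That routing rule is easy to encode as a helper so the choice is made in one place. A minimal sketch; the task categories are illustrative assumptions, and the model IDs are the ones from the table above:

```typescript
// Illustrative task categories; adjust to your own workload taxonomy
type TaskKind = 'classification' | 'extraction' | 'chat' | 'coding' | 'deep_reasoning';

// Route each task to the cheapest model that handles it well
function pickModel(task: TaskKind): string {
  switch (task) {
    case 'classification':
    case 'extraction':
      return 'claude-haiku-4-5';   // simple, high-volume work
    case 'chat':
    case 'coding':
      return 'claude-sonnet-4-6';  // the production default
    case 'deep_reasoning':
      return 'claude-opus-4-6';    // reserve for high-stakes tasks
  }
}

console.log(pickModel('coding')); // → claude-sonnet-4-6
```

Centralising the choice also makes it trivial to A/B a cheaper model later: change one function, not every call site.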

2. Installation and First Request

Bash
npm install @anthropic-ai/sdk
# Add to .env:
# ANTHROPIC_API_KEY=sk-ant-...
TypeScript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

async function ask(question: string): Promise<string> {
  const message = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: question }
    ],
  });

  // message.content is an array of content blocks
  const block = message.content[0];
  if (block.type === 'text') return block.text;
  throw new Error('Unexpected response type');
}

const answer = await ask('What is the time complexity of quicksort?');
console.log(answer);
Important: message.content is always an array, even for simple text responses. Always check block.type === 'text' before accessing block.text. This becomes critical when you add tool use.
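Because content is an array, a response can in principle contain several text blocks (and, once tools are involved, non-text blocks between them). A helper that concatenates every text block is safer than indexing [0]. A sketch using a minimal local block type rather than the SDK's wider union:

```typescript
// Minimal shape of the blocks we care about; the SDK's own ContentBlock
// type is a wider union, this local type is just for illustration
type ContentBlockLike =
  | { type: 'text'; text: string }
  | { type: string; [key: string]: unknown };

// Join the text of every text block, skipping anything else (tool_use etc.)
function extractText(content: ContentBlockLike[]): string {
  return content
    .filter((b): b is { type: 'text'; text: string } => b.type === 'text')
    .map(b => b.text)
    .join('');
}

// Usage: extractText(message.content) instead of message.content[0].text
```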

3. System Prompts — Your Most Important Tool

The system field shapes Claude's entire behaviour for the session. A well-crafted system prompt is the single most impactful thing you can do to improve response quality.

TypeScript
const message = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 2048,
  system: `You are a senior TypeScript engineer reviewing code for a financial services startup.

RULES:
- Flag any potential security vulnerabilities immediately with SEVERITY: HIGH/MEDIUM/LOW
- Always explain WHY something is a problem, not just what the problem is
- Suggest specific fixes with code examples
- Check for: SQL injection, XSS, insecure deserialization, improper auth
- Output format: structured JSON with keys: issues[], suggestions[], overall_score (1-10)

CONSTRAINTS:
- Do not rewrite the entire file — only identify and fix specific issues
- If no issues are found, say so explicitly and explain what you checked`,

  messages: [
    { role: 'user', content: `Review this code:\n\`\`\`typescript\n${userCode}\n\`\`\`` }
  ],
});

System Prompt Best Practices

  • Persona first: Start with "You are a..." to set the context immediately
  • Explicit rules: Use numbered or bulleted rules — Claude follows structured instructions very well
  • Output format: Specify JSON, markdown, or prose explicitly to get consistent parsing
  • Constraints matter: Tell Claude what NOT to do — it reduces hallucination and scope creep
  • Keep it focused: One job per system prompt. Don't try to make one prompt do everything

4. Multi-Turn Conversations

The Claude API is stateless — you must send the full conversation history with every request. This is different from some chat APIs that maintain sessions server-side.

TypeScript
import Anthropic from '@anthropic-ai/sdk';

// MessageParam is not a top-level named export; use the SDK's namespace types
type MessageParam = Anthropic.MessageParam;

const client = new Anthropic();

class ConversationSession {
  private history: MessageParam[] = [];
  private system: string;

  constructor(system: string) {
    this.system = system;
  }

  async chat(userMessage: string): Promise<string> {
    // Add user message to history
    this.history.push({ role: 'user', content: userMessage });

    const response = await client.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 2048,
      system: this.system,
      messages: this.history,
    });

    const block = response.content[0];
    if (block.type !== 'text') throw new Error('Expected text response');

    // Add assistant response to history
    this.history.push({ role: 'assistant', content: block.text });

    return block.text;
  }

  reset() { this.history = []; }
  getHistory() { return [...this.history]; }
}

// Usage
const session = new ConversationSession(
  'You are a helpful coding assistant. Keep responses concise.'
);

console.log(await session.chat('How do I read a file in Node.js?'));
console.log(await session.chat('And how do I write to a file?'));
console.log(await session.chat('Can you show me both in a single function?'));
Context window management: At 200K tokens, the context window is large — but not infinite. For long-running sessions, implement a sliding window strategy: keep the system prompt, the last N messages, and a summary of earlier conversation injected as context.
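The sliding-window idea reduces to a pure function over the history array. A sketch; the keepLast cutoff and the summary-as-first-message convention are illustrative choices, not SDK features, and summarise would itself be an API call in production:

```typescript
type Msg = { role: 'user' | 'assistant'; content: string };

// Keep the most recent `keepLast` messages and fold everything older into a
// single summary message at the front of the window.
function windowHistory(
  history: Msg[],
  keepLast: number,
  summarise: (older: Msg[]) => string
): Msg[] {
  if (history.length <= keepLast) return history;
  const older = history.slice(0, history.length - keepLast);
  const recent = history.slice(history.length - keepLast);
  return [
    { role: 'user', content: `Summary of earlier conversation: ${summarise(older)}` },
    ...recent,
  ];
}
```

One practical detail: pick keepLast so the retained window starts on a user message, since the API expects alternating roles beginning with user.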

5. Streaming Responses

Streaming is essential for any user-facing chat interface. Without it, users stare at a blank screen for seconds before anything appears. The Anthropic SDK makes streaming straightforward:

TypeScript — Basic Streaming
async function streamResponse(prompt: string): Promise<void> {
  const stream = await client.messages.stream({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    messages: [{ role: 'user', content: prompt }],
  });

  // Stream tokens to stdout as they arrive
  for await (const chunk of stream) {
    if (
      chunk.type === 'content_block_delta' &&
      chunk.delta.type === 'text_delta'
    ) {
      process.stdout.write(chunk.delta.text);
    }
  }

  // Get the final message once streaming is complete
  const finalMessage = await stream.finalMessage();
  console.log('\n\nStop reason:', finalMessage.stop_reason);
  console.log('Input tokens:', finalMessage.usage.input_tokens);
  console.log('Output tokens:', finalMessage.usage.output_tokens);
}
TypeScript — Streaming in Express API
import express from 'express';
import Anthropic from '@anthropic-ai/sdk';

const app = express();
app.use(express.json());
const client = new Anthropic();

app.post('/api/chat', async (req, res) => {
  const { messages, system } = req.body;

  // Set SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  try {
    const stream = await client.messages.stream({
      model: 'claude-sonnet-4-6',
      max_tokens: 2048,
      system,
      messages,
    });

    for await (const chunk of stream) {
      if (
        chunk.type === 'content_block_delta' &&
        chunk.delta.type === 'text_delta'
      ) {
        res.write(`data: ${JSON.stringify({ text: chunk.delta.text })}\n\n`);
      }
    }

    res.write('data: [DONE]\n\n');
    res.end();

  } catch (error) {
    res.write(`data: ${JSON.stringify({ error: 'Stream failed' })}\n\n`);
    res.end();
  }
});
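On the client side, the events this endpoint writes arrive as `data: …` lines separated by blank lines, and network chunks do not align with event boundaries. A sketch of the parsing step (the fetch/ReadableStream plumbing around it is standard browser code):

```typescript
// Split a raw SSE buffer into the JSON payloads the endpoint above emits.
// Returns parsed event payloads plus any trailing partial event to carry
// over into the next read.
function parseSseBuffer(buffer: string): { events: string[]; rest: string } {
  const parts = buffer.split('\n\n');
  const rest = parts.pop() ?? '';          // last piece may be incomplete
  const events = parts
    .map(p => p.replace(/^data: /, ''))
    .filter(p => p !== '[DONE]');          // the endpoint's end-of-stream marker
  return { events, rest };
}

// In the browser, feed it from response.body's reader:
//   const { events, rest } = parseSseBuffer(carry + decoder.decode(chunk));
```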

6. Tool Use (Function Calling)

Tool use is where Claude becomes genuinely powerful. You define tools with JSON Schema, Claude decides when to call them, you execute the function, then send the result back. This enables agents, data pipelines, and AI-powered automations.

TypeScript — Tool Definition and Execution
import Anthropic from '@anthropic-ai/sdk';

// Tool and MessageParam are not top-level named exports; use the SDK namespace
type Tool = Anthropic.Tool;
type MessageParam = Anthropic.MessageParam;

const client = new Anthropic();

// Define your tools
const tools: Tool[] = [
  {
    name: 'get_weather',
    description: 'Get current weather for a city. Returns temperature, condition, and humidity.',
    input_schema: {
      type: 'object',
      properties: {
        city: { type: 'string', description: 'City name, e.g. "London" or "New York"' },
        units: {
          type: 'string',
          enum: ['celsius', 'fahrenheit'],
          description: 'Temperature units. Defaults to celsius.'
        }
      },
      required: ['city']
    }
  },
  {
    name: 'search_database',
    description: 'Search the product database. Returns matching products with price and stock.',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search query' },
        limit: { type: 'number', description: 'Max results (default 5, max 20)' }
      },
      required: ['query']
    }
  }
];

// Mock tool executors — replace with real implementations
async function executeTool(name: string, input: Record<string, unknown>): Promise<string> {
  if (name === 'get_weather') {
    const { city, units = 'celsius' } = input as { city: string; units?: string };
    // In production: call a real weather API here
    return JSON.stringify({
      city,
      temperature: units === 'celsius' ? 22 : 72,
      condition: 'Partly cloudy',
      humidity: '65%',
      units
    });
  }

  if (name === 'search_database') {
    const { query, limit = 5 } = input as { query: string; limit?: number };
    // In production: query your real database
    return JSON.stringify({
      results: [
        { id: 1, name: `${query} Pro`, price: 99.99, stock: 42 },
        { id: 2, name: `${query} Lite`, price: 49.99, stock: 8 }
      ].slice(0, limit)
    });
  }

  return JSON.stringify({ error: `Unknown tool: ${name}` });
}

// Agentic loop — handles multiple tool calls automatically
async function runAgent(userMessage: string): Promise<string> {
  const messages: MessageParam[] = [
    { role: 'user', content: userMessage }
  ];

  while (true) {
    const response = await client.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 4096,
      tools,
      messages,
    });

    // If Claude didn't request a tool, return the final text response.
    // Checking for anything other than 'tool_use' (not just 'end_turn')
    // avoids an infinite loop when the stop reason is max_tokens or stop_sequence.
    if (response.stop_reason !== 'tool_use') {
      const textBlock = response.content.find(b => b.type === 'text');
      return textBlock?.type === 'text' ? textBlock.text : '';
    }

    // Handle tool calls
    if (response.stop_reason === 'tool_use') {
      // Add Claude's response (with tool_use blocks) to history
      messages.push({ role: 'assistant', content: response.content });

      // Execute all tool calls and collect results
      const toolResults = await Promise.all(
        response.content
          .filter(block => block.type === 'tool_use')
          .map(async block => {
            if (block.type !== 'tool_use') return null;
            const result = await executeTool(
              block.name,
              block.input as Record<string, unknown>
            );
            return {
              type: 'tool_result' as const,
              tool_use_id: block.id,
              content: result
            };
          })
      );

      // Add tool results to history and continue the loop
      messages.push({
        role: 'user',
        content: toolResults.filter(Boolean) as Anthropic.ToolResultBlockParam[]
      });
    }
  }
}

// Test it
const result = await runAgent(
  'What\'s the weather in Tokyo right now? Also find me some "wireless headphones" in the store.'
);
console.log(result);
Key insight: The agentic loop runs until stop_reason === 'end_turn'. Claude may call tools multiple times in a single turn, or chain tool calls together. Your loop must handle this automatically — that's the whole architecture.

7. Structured JSON Output

A very common pattern: force Claude to return structured JSON that your application can parse reliably. Two approaches work well:

TypeScript — Approach 1: System Prompt + JSON Mode
import { z } from 'zod';

// Define the expected schema
const ProductSchema = z.object({
  name: z.string(),
  category: z.string(),
  price_range: z.object({
    min: z.number(),
    max: z.number(),
    currency: z.string()
  }),
  features: z.array(z.string()),
  sentiment: z.enum(['positive', 'neutral', 'negative']),
  confidence: z.number().min(0).max(1)
});

type Product = z.infer<typeof ProductSchema>;

async function extractProductData(rawText: string): Promise<Product> {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    // Note: JSON.stringify(ProductSchema.shape) would serialise Zod's internal
    // objects, not a readable schema. Spell the shape out by hand here, or
    // generate it with a library such as zod-to-json-schema.
    system: `You are a data extraction API. Always respond with valid JSON only.
No markdown, no explanation, no code fences — just the raw JSON object.
Keys: name (string), category (string),
price_range (object with min: number, max: number, currency: string),
features (array of strings), sentiment ("positive" | "neutral" | "negative"),
confidence (number between 0 and 1).`,
    messages: [
      { role: 'user', content: `Extract product data from this text:\n\n${rawText}` }
    ]
  });

  const block = response.content[0];
  if (block.type !== 'text') throw new Error('Expected text');

  const parsed = JSON.parse(block.text);
  return ProductSchema.parse(parsed); // Zod validates AND types the output
}
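Even with a strict system prompt, models occasionally wrap JSON in markdown code fences anyway. A defensive parse that strips fences before JSON.parse makes the extraction step more robust. A sketch:

```typescript
// Strip optional markdown code fences, then parse. Throws (like JSON.parse)
// if what remains still isn't valid JSON; callers should catch the error
// and decide whether to retry the request.
function parseJsonLoose(raw: string): unknown {
  const stripped = raw
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, '')  // leading ``` or ```json
    .replace(/\s*`{3}$/, '');           // trailing ```
  return JSON.parse(stripped);
}
```

Drop it in before the Zod step: `ProductSchema.parse(parseJsonLoose(block.text))`.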
TypeScript — Approach 2: Tool Use for Guaranteed Structure
// Using a "fake" tool as a structured output mechanism
// Claude MUST call the tool, so output is always structured

const response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  tool_choice: { type: 'tool', name: 'save_product' }, // Force this tool
  tools: [{
    name: 'save_product',
    description: 'Save the extracted product information',
    input_schema: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        price: { type: 'number' },
        in_stock: { type: 'boolean' },
        tags: { type: 'array', items: { type: 'string' } }
      },
      required: ['name', 'price', 'in_stock', 'tags']
    }
  }],
  messages: [
    { role: 'user', content: `Extract: ${rawText}` }
  ]
});

// The tool call input IS your structured data
const toolUse = response.content.find(b => b.type === 'tool_use');
if (toolUse?.type === 'tool_use') {
  const data = toolUse.input; // Matches your schema in practice; still validate before trusting it
  console.log(data);
}

8. Error Handling and Retries

Production Claude API applications need robust error handling. The main error types to handle:

TypeScript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function robustCall(
  messages: Anthropic.MessageParam[],
  retries = 3
): Promise<string> {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      const response = await client.messages.create({
        model: 'claude-sonnet-4-6',
        max_tokens: 1024,
        messages,
      });

      const block = response.content[0];
      if (block.type === 'text') return block.text;
      throw new Error('No text in response');

    } catch (error) {
      if (error instanceof Anthropic.RateLimitError) {
        // 429 — wait before retry
        const waitMs = Math.pow(2, attempt) * 1000; // Exponential backoff
        console.warn(`Rate limited. Waiting ${waitMs}ms before retry ${attempt}/${retries}`);
        await new Promise(r => setTimeout(r, waitMs));
        continue;
      }

      if (error instanceof Anthropic.APIError) {
        if (error.status === 529) {
          // API overloaded — brief wait
          await new Promise(r => setTimeout(r, 2000 * attempt));
          continue;
        }
        if (error.status >= 500) {
          // Server error — retry
          await new Promise(r => setTimeout(r, 1000 * attempt));
          continue;
        }
        // 4xx errors (except 429) are not retriable — bad request, auth, etc.
        throw error;
      }

      throw error; // Unknown error — rethrow
    }
  }

  throw new Error(`Failed after ${retries} attempts`);
}

9. Real-World Patterns

1. Code Review Bot in CI/CD

Trigger on PR open via GitHub Actions. Send diff + changed files to Claude with a code review system prompt. Post Claude's feedback as a PR comment automatically. Filter by file type to only review TypeScript/PHP/etc. Haiku is cost-effective for this at scale.

2. Document Processing Pipeline

Feed PDFs/Word docs as extracted text. Use structured output (tool use pattern) to extract entities, dates, amounts, clauses. Store in a database. Claude Sonnet handles 200K tokens — most legal or financial docs fit in one call without chunking.

3. Customer Support Agent

System prompt defines product knowledge, escalation rules, and tone. Tool use connects to your CRM (order lookup, account status). Conversation history provides context. Use Claude Haiku for first-pass triage, escalate to Sonnet for complex issues.

4. RAG (Retrieval-Augmented Generation)

Embed user queries with any embedding model. Retrieve top-K relevant chunks from a vector database (Pinecone, pgvector, Weaviate). Inject chunks into the system prompt as context. Claude synthesises an answer grounded in your data, not hallucinated.
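The injection step is plain string assembly. A sketch of building the system prompt from retrieved chunks; the chunk shape and the grounding instructions are illustrative assumptions:

```typescript
type Chunk = { source: string; text: string };

// Build a grounded system prompt from retrieved chunks. Numbering the
// chunks lets you ask Claude to cite which ones it used.
function buildRagSystemPrompt(basePrompt: string, chunks: Chunk[]): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.source})\n${c.text}`)
    .join('\n\n');
  return `${basePrompt}

Answer ONLY from the context below. If the context does not contain the answer, say so.

CONTEXT:
${context}`;
}
```

Because the context block is large and often repeated across turns, it is also a natural candidate for prompt caching (see the cost section below).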

5. Multi-Agent Orchestration

Use one Claude instance as an "orchestrator" that breaks tasks into subtasks and dispatches them to specialised "worker" agents (each with their own system prompt and tools). The orchestrator collects results and synthesises the final output. Powerful for complex multi-step workflows.

10. Cost Optimisation

Claude API costs scale with tokens. Here are the strategies that make the biggest difference:

  • Use the right model for the task. Haiku costs 12x less than Sonnet. For classification, routing, and simple extraction, Haiku is more than sufficient.
  • Prompt caching: If your system prompt is large and repeated, enable prompt caching to reduce input costs by up to 90% on cached tokens.
  • Set appropriate max_tokens. If you know responses will be short, set a lower limit — you're not billed for unused tokens, but it prevents runaway responses.
  • Compress conversation history. For long sessions, periodically ask Claude to summarise earlier exchanges and replace the full history with the summary.
  • Batch non-urgent tasks using the Batch API for up to 50% cost reduction on async workloads.
Practical cost estimate: A customer support bot handling 1,000 conversations/day (average 10 exchanges, ~500 tokens each) using claude-sonnet-4-6 costs roughly $45–90/day depending on response length. Switching to Haiku for initial triage (80% of cases) drops this to ~$15–20/day.
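Estimates like this are easiest to sanity-check with a small calculator. A sketch; the prices you pass in should come from Anthropic's current pricing page (the $3/1M input figure is from the table above, and the $15/1M output price for Sonnet used in the example below is an assumption to verify):

```typescript
// Rough daily cost: conversations × calls per conversation × tokens per call.
// avgInputTokens should include the resent history, which is what dominates
// input cost in multi-turn sessions.
function dailyCostUsd(opts: {
  conversationsPerDay: number;
  callsPerConversation: number;
  avgInputTokens: number;
  avgOutputTokens: number;
  inputPricePer1M: number;
  outputPricePer1M: number;
}): number {
  const calls = opts.conversationsPerDay * opts.callsPerConversation;
  const inputCost = (calls * opts.avgInputTokens / 1_000_000) * opts.inputPricePer1M;
  const outputCost = (calls * opts.avgOutputTokens / 1_000_000) * opts.outputPricePer1M;
  return inputCost + outputCost;
}

// Example: 1,000 conversations/day, 10 calls each, ~1K input / 500 output
// tokens per call at assumed Sonnet prices:
console.log(dailyCostUsd({
  conversationsPerDay: 1000,
  callsPerConversation: 10,
  avgInputTokens: 1000,
  avgOutputTokens: 500,
  inputPricePer1M: 3,
  outputPricePer1M: 15,
})); // → 105
```

Plugging in your real token averages from usage logs beats any back-of-envelope figure.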

11. Production Checklist

  • ✅ API key stored in environment variables, never hardcoded
  • ✅ Exponential backoff and retry logic for 429/5xx errors
  • ✅ max_tokens set appropriately to cap costs and prevent runaway responses
  • ✅ Input validation before sending to API (sanitise user content)
  • ✅ Logging: log token usage per request for cost monitoring
  • ✅ Timeout set on HTTP client (default can be too long)
  • ✅ Rate limiting on your own API endpoints to prevent abuse
  • ✅ System prompt tested for prompt injection resistance
  • ✅ Structured output validated with Zod or equivalent before use
  • ✅ Model version pinned (avoid unexpected behaviour from model updates)
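Two of the checklist items, timeouts and retries, are constructor options on the SDK client. The values below are illustrative defaults, not recommendations from Anthropic:

```typescript
// Connection options worth setting explicitly (values are illustrative):
const clientOptions = {
  timeout: 60_000,   // per-request timeout in ms; fail fast on stalled responses
  maxRetries: 2,     // SDK-level automatic retries for transient errors
};

// Pass them alongside the API key when constructing the client:
// const client = new Anthropic({
//   apiKey: process.env.ANTHROPIC_API_KEY,
//   ...clientOptions,
// });
```

Note the SDK already retries some transient failures on its own; the hand-rolled backoff from section 8 sits on top of that for rate limits and overload errors.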

FAQs

How do I get an Anthropic API key?

Sign up at console.anthropic.com. New accounts get free credits to start. Production usage is pay-as-you-go, billed by tokens. No monthly minimum.

What's the difference between claude-sonnet-4-6 and claude-opus-4-6?

Opus is Anthropic's most capable model — better at complex reasoning, nuanced instructions, and difficult multi-step tasks. Sonnet is faster and 5x cheaper while still being excellent for most production use cases. Start with Sonnet and only upgrade to Opus if you see clear quality gaps.

Can I use the Claude API for commercial applications?

Yes. The Anthropic API is available for commercial use under Anthropic's usage policies. Review the policies at anthropic.com/legal/aup for any content restrictions relevant to your use case.

Does the Claude API support images?

Yes. Claude is multimodal — you can include images in messages as base64-encoded data or public URLs. Image tokens are counted separately from text tokens.
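An image goes into the same content array as text, as its own block. A sketch of the message shape for the base64 case; the PNG media type and the helper name are illustrative, and `data` must be raw base64 without a `data:image/...` prefix:

```typescript
// Build a user message pairing an image block with a text question.
function imageMessage(base64Png: string, question: string) {
  return {
    role: 'user' as const,
    content: [
      {
        type: 'image' as const,
        source: {
          type: 'base64' as const,
          media_type: 'image/png' as const,
          data: base64Png,
        },
      },
      { type: 'text' as const, text: question },
    ],
  };
}

// Usage: client.messages.create({ model, max_tokens, messages: [imageMessage(b64, 'What is in this image?')] })
```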

Need a Claude API Integration Built?

I build production Claude API integrations, AI chatbots, document processing pipelines, and automation workflows. Free 30-minute technical consultation.

Get Free Consultation →
Anju Batta

Senior Full Stack Developer & AI Automation Architect. I build commercial applications with the Claude API, OpenAI API, and automation tools like n8n. Based in Chandigarh, India.

Read Next: Claude vs ChatGPT vs Gemini Comparison →