Claude 3.6: Anthropic's Biggest Model Update of 2025

What Changed in 3.6

Claude 3.6 represented an incremental but meaningful update across Anthropic's model lineup. Released December 2, 2025, the 3.6 update focused on three areas where enterprise customers had provided the most feedback: tool use reliability, instruction following precision, and refusal calibration. Anthropic described 3.6 as a "polish and reliability" release rather than a capability breakthrough — consolidating gains rather than pushing new frontiers.

Tool Use Improvements

Tool use — the ability to call external functions, APIs, and services in a structured way — was significantly more reliable in 3.6. The most common tool use failure mode in 3.5 (the model generating syntactically invalid tool call arguments or calling tools in the wrong sequence for multi-step tasks) was dramatically reduced. In Anthropic's internal agentic task benchmarks, successful tool use completion rates increased by 23% for complex multi-tool scenarios.

Instruction Following

On Anthropic's IFEval benchmark (instruction following evaluation), 3.6 showed a 15% improvement over 3.5. Practical improvements included: more consistent adherence to output length specifications, more reliable format constraints (JSON output staying valid, tables maintaining consistent column structure), and better performance on complex conditional instructions ("if the user asks about X, respond with Y, otherwise respond with Z").

Refusal Calibration

Anthropic recalibrated Claude 3.6's safety behaviours based on extensive feedback about over-refusals in legitimate professional contexts. The model became more willing to discuss topics like medication dosages in medical professional contexts, legal strategy in attorney-client simulations, and security vulnerability analysis for authorised penetration testing scenarios. The recalibration reduced false-positive refusals by 40% while maintaining protection against genuinely harmful requests.

What This Means for Indian Businesses

Claude 3.6's improved tool use reliability directly benefits Indian developers building agentic systems on the Anthropic API. Fewer failed tool calls means fewer human interventions in automated workflows. The reduced refusal rate on edge cases — particularly for legitimate business requests in finance, legal, and healthcare that earlier Claude versions occasionally over-refused — makes Claude 3.6 more useful for the real-world messiness of Indian business operations.