Welcome to this week’s roundup of AI news that actually matters if you code with AI. As always: every item below was verified against its primary source, and anything we couldn’t confirm — or that won’t matter in a month — got cut. New here? Start with our AI for Developers hub, and catch up on last week’s roundup.
Models
Claude Sonnet 5: near-Opus agentic coding at mid-tier prices
Anthropic shipped Claude Sonnet 5 on June 30, and it’s the most interesting model release of the week for working developers. The pitch is simple: agentic performance close to Opus 4.8 — planning, tool use, terminals, long multi-step coding runs — at a fraction of the cost. It launches at an introductory $2/M input and $10/M output tokens through August 31 (then $3/$15), and it’s now the default model on Claude Free and Pro plans, plus Claude Code and the API as claude-sonnet-5. One thing to budget for: it uses a new tokenizer that can map the same input to roughly 1.0–1.35x more tokens, which the intro pricing is designed to offset. If you’ve been paying Opus prices for agent workloads, this is the one to benchmark first.
OpenAI previews GPT-5.6 Sol, Terra, and Luna
On June 26 OpenAI announced the GPT-5.6 family: Sol (flagship), Terra (GPT-5.5-level performance at half the cost), and Luna (fastest, cheapest). For coding, Sol claims a new state of the art on Terminal-Bench 2.1, and the release adds a “max” reasoning effort plus an “ultra” mode that spins up subagents for complex work. Pricing is already published — Sol $5/$30, Terra $2.50/$15, Luna $1/$6 per million tokens — along with more predictable prompt caching (explicit cache breakpoints, 30-minute minimum cache life). The catch: it’s a limited preview for selected partners via the API and Codex for now, with general availability promised “in the coming weeks.” Real, verified, and worth planning around — but you probably can’t use it today.
Claude Fable 5 is back online globally
If you lost access to Anthropic’s Fable 5 in June, it’s back. The US export controls applied on June 12 were lifted June 30, and access was restored globally on July 1 across the Claude API, claude.ai, and Claude Code. Pro, Max, Team, and select Enterprise plans get Fable 5 included for up to 50% of weekly usage limits through July 7, after which it moves to usage credits. Anthropic also shipped a stricter safety classifier as part of the resolution — expect somewhat more false-positive blocks on routine security-adjacent coding and debugging, with blocked requests falling back to Opus 4.8.
Tools
A busy week in GitHub Copilot
GitHub’s changelog was unusually dense this week. The highlights for daily use: Claude Sonnet 5 is now generally available in Copilot (June 30) and Kimi K2.7 joined the model picker a day later, so you can try both new models without leaving your editor. For CI users, Copilot CLI no longer needs a personal access token in GitHub Actions — a genuine quality-of-life fix that removes a common secret-management headache. And browser tools for Copilot in VS Code hit general availability, letting the agent drive a browser to verify its own front-end changes. Heads-up if you pinned older Google models: Gemini 2.5 Pro and Gemini 3 Flash are being deprecated in Copilot.
Agents & frameworks
Google ADK 2.0: deterministic workflows around your agents
Google published a detailed look at why it built ADK 2.0, and the design is worth your attention even if you don’t use Google’s stack. The Agent Development Kit now includes a graph-based workflow engine that lets you keep deterministic steps (tool calls, routing, human-in-the-loop approvals) in plain code and reserve the LLM only for nodes that need actual reasoning. Google’s illustrative benchmark shows roughly 50% token savings and 20% lower latency versus a vanilla prompt-orchestrated agent, and the structural separation also blunts prompt-injection attacks, since a manipulated LLM node has no graph edges to unauthorized actions. Workflows are available in Python and newly in Go. If you’re weighing options, see our comparison of open-source agent frameworks.
Genkit’s new Agents API: full-stack conversational agents
Also from the Google open-source stable, Firebase’s Genkit framework launched an Agents API in preview (TypeScript and Go) on July 1. It packages the plumbing every chat-style agent app needs — message history, the tool loop, streaming, session persistence, and a client wire protocol — behind a single chat() interface that works the same in-process or over HTTP. Nice touches include interruptible tools for human approval before risky actions, detachable long-running turns you can reconnect to by snapshot ID, and a Vercel AI SDK useChat adapter. It’s Beta and APIs may break, but if you’re hand-rolling chat infrastructure for a product feature, this is a lot of boilerplate deleted.
Worth a look
A reminder that lands harder every week: model prices and lineups are moving fast — Sonnet 5’s intro pricing and GPT-5.6’s tier restructure both reshuffle the cost math. Our guide on cutting LLM API costs covers the durable tactics (caching, routing, effort levels) that survive every reprice. And if you’re re-evaluating your default coding model after this week, our Claude vs ChatGPT vs Gemini for coding comparison is a good starting point.
That’s it for this week. Everything above links to a primary source — no rumors, no vaporware. See you next Friday.