April 2026: GPT-5.5, Opus 4.7, DeepSeek V4, AI Trust Issues, Patrol MCP

Opus 4.7, GPT-5.5, and DeepSeek V4 all shipped this month.

Three launches, three different stories. Opus 4.7 and GPT-5.5 claim record benchmark scores, but developer trust is shaky — we'll get into why. DeepSeek V4 barely made English-language headlines, yet its architecture is purpose-built for agents in ways that the US models are not.

The question worth asking this month is not "how good is the model?" but "how do we make AI coding trustworthy, practical, and sustainable?"

Before we get to that, let's start with some Flutter news.

Flutter Updates

Flutter-specific news was lighter this month, but two updates are worth highlighting.

📝 Flutter's Material and Cupertino code freeze

The most important near-term Flutter change is probably Flutter's Material and Cupertino code freeze.

After Flutter 3.44, Material and Cupertino will move out of the SDK into separate packages. The migration is meant to be mechanically straightforward, and dart fix should handle most of the work.

This matters because it gives these design systems their own release cadence instead of tying every update to the Flutter SDK cycle. At first glance, this looks like a maintenance change. In practice, it is one of the bigger ecosystem shifts this year.

The earliest discussion about decoupling design in Flutter started all the way back in 2022. You can check the latest progress here.

📝 How Dart and Flutter are thinking about AI in 2026

The other link worth reading is How Dart and Flutter are thinking about AI in 2026.

This is less of a roadmap and more of a strategy note. The main points are:

Build for three groups: traditional developers, AI-assisted developers, and AI-first developers.
Keep Dart human-first: readability and language ergonomics matter.
Use AI to extend the developer experience, not replace it.
Stay agent-agnostic through open standards like MCP.
Reduce the verification tax by improving code quality.

In summary, the Flutter team is focused on consolidation on the framework side, and a more serious AI strategy on the tooling side.

AI News

Three major model launches in one month. You'd think that would be exciting. But the more interesting questions this month aren't about benchmark scores or parameter counts — they're about trust, economics, and what the market keeps overlooking.

📹 Benchmarks Up, Trust Down

On paper, GPT-5.5 and Opus 4.7 show better-than-ever benchmark scores.

But Anthropic users in particular are pushing back. Complaints about declining output quality and stricter usage limits have been growing louder. And when real-world performance gets worse while benchmarks keep climbing, people become skeptical.

This recent video, titled "It's all fake", makes the case that these benchmark numbers should not be taken at face value:

Ultimately, it boils down to Goodhart's law:

When a measure becomes a target, it ceases to be a good measure.

That skepticism is not just hype fatigue. It also reflects the economics.

📹 The End of Cheap Tokens?

Here's a reality we need to face: current AI subscriptions are very underpriced relative to their costs.

When these subscriptions launched, most usage was chat — low token counts, low cost. Agentic coding flips that: long-running workflows burn through tokens fast, and the old pricing math no longer works.

If that pressure stays high, expect:

stricter usage limits
more token-based pricing
cheaper plans losing access to the best tools
quieter quality trade-offs behind the scenes
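
To make the pricing math concrete, here is a back-of-the-envelope sketch in Python. Every number in it is a made-up assumption for illustration: the per-token prices, the session counts, the token volumes, and the $20 plan price. None of them come from a real provider's price list, and the sketch ignores prompt caching and other discounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageProfile:
    """Hypothetical monthly usage pattern for one subscriber."""
    name: str
    sessions_per_month: int
    input_tokens_per_session: int
    output_tokens_per_session: int


def monthly_cost_usd(profile, input_price_per_mtok, output_price_per_mtok):
    """Rough serving cost: total tokens times a per-million-token price."""
    input_tokens = profile.sessions_per_month * profile.input_tokens_per_session
    output_tokens = profile.sessions_per_month * profile.output_tokens_per_session
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


# Assumed numbers only. Chat: short prompts, short answers.
CHAT = UsageProfile("chat", 100, 2_000, 1_000)
# Agentic coding: long-running sessions that re-read a large context on every tool call.
AGENTIC = UsageProfile("agentic", 100, 2_000_000, 50_000)

INPUT_PRICE = 3    # hypothetical USD per million input tokens
OUTPUT_PRICE = 15  # hypothetical USD per million output tokens
PLAN_PRICE = 20    # hypothetical flat monthly subscription in USD

if __name__ == "__main__":
    for profile in (CHAT, AGENTIC):
        cost = monthly_cost_usd(profile, INPUT_PRICE, OUTPUT_PRICE)
        print(f"{profile.name}: ~${cost:,.2f} to serve vs ${PLAN_PRICE} plan")
```

With these made-up inputs, the chat profile costs about $2 a month to serve and the agentic profile about $675, against a $20 plan. Real numbers will differ, but the gap scales with how much context an agent re-reads per session, which is why a flat price designed for chat struggles with agentic workloads.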

Maximilian Schwarzmüller breaks down the economics of AI subscriptions in this video:

For a more dramatic and very plausible take on where this is all going, watch How OpenAI ends and takes Oracle with it | Ed Zitron. It is a good reality check on inference and infrastructure costs.

📝 Anthropic and the Trust Problem

When a coding tool becomes part of daily work and quality slips, trust breaks fast. Anthropic is the clearest case study this month.

AMD's director of AI collected months of log data and documented a clear performance decline:

Claude Code is unusable for complex engineering tasks with the Feb updates

Anthropic responded with this postmortem:

An update on recent Claude Code quality reports

It's not the first time Anthropic has had to address quality concerns. Some users suspect the company is short on compute and is quietly cutting costs behind the scenes.

📝 DeepSeek V4: The Model the Headlines Missed

While OpenAI and Anthropic dominate the narrative, DeepSeek V4 deserves more attention: it offers a 1M-token context window and matches frontier closed models on popular agentic coding benchmarks, at a fraction of the cost.

Behind those numbers: a hybrid attention design that keeps inference cheap at long context, interleaved thinking that persists across tool calls, and RL training on real sandboxes.

Read the full details here:

DeepSeek-V4: a million-token context that agents can actually use.

In a way, DeepSeek is making the architectural version of the same bet this newsletter keeps coming back to: better infrastructure matters more than better benchmarks.

📹 Flutter E2E Testing with Patrol MCP

Models matter, but making AI coding trustworthy requires more than that. Verification is a big part of the puzzle — and for mobile apps, that means AI needs a way to actually interact with the app, not just generate code and hope it works.

That's where Patrol MCP comes in.

It gives AI agents a way to write, run, and debug end-to-end tests with Patrol, and this video shows it in action:

LeanCode also released Marionette MCP for more active development workflows.

Latest from Code With Andrea

Last month, I launched my Agentic Coding Toolkit, which brings spec-driven AI workflows to Flutter.

Since then, I've been listening to feedback and improving the core product, and I'm now planning to expand platform support beyond OpenCode and Claude Code.

I've also recorded a video showing what the entire workflow looks like on a real-world brownfield app. If you're curious, you can watch it here:

Spec-Driven Development Walkthrough (Video)

Until Next Time

In last month's newsletter, I said I wanted to publish some new videos about agentic coding, but then I got busy with improvements to my AI toolkit. 😅

New videos are back on the agenda (I promise!), so stay tuned for updates. 🙂

Thanks for reading, and happy coding!