Today's issue, in seven beats —
SAT·02
MAY 26
Ephemeris · A daily typographic magazine
A take-home Opus can't just solve.
Anthropic's hiring team rebuilt their take-home from scratch — twice — after Opus 4 and then Opus 4.5 walked through earlier versions in minutes. The post is the design log of an interview problem engineered against frontier models, with the constraints, the failures, and the lessons for anyone still asking "implement a cache" in 2026.
"The bar isn't 'can a human solve this?' anymore. It's 'can a human solve this in a way the model still can't shortcut?' That's a different design problem.— Anthropic Engineering
Five moves to defend the next decade.
OpenAI lays out a five-part action plan for cybersecurity in the AI era — democratising AI-powered defence, hardening critical systems, and aligning with national-security partners. Worth reading less for the policy framing than for the specific posture they're proposing AI labs and customers adopt.
Tools the agent discovers at runtime.
Claude can now find, learn, and execute tools dynamically — instead of having every JSON schema crammed into the system prompt at session start. The implication, if you're building agents: stop pre-loading the world. Let the model ask for a manual when it needs one.
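To make "ask for a manual when it needs one" concrete, here is a minimal sketch of the pattern, not Anthropic's API. The `Tool` and `ToolRegistry` names, the keyword scoring, and the three example tools are all hypothetical. Only one-line summaries are searchable up front, and a full schema is handed over when a specific tool is requested.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    summary: str                                 # short description, cheap to keep around
    schema: dict = field(default_factory=dict)   # full JSON schema, fetched only on demand


class ToolRegistry:
    """Hypothetical registry: search summaries first, load one schema later."""

    def __init__(self, tools):
        self._tools = {tool.name: tool for tool in tools}

    def search(self, query: str, limit: int = 3) -> list[str]:
        """Rank tools by how many query words appear in the name or summary."""
        words = query.lower().split()
        scored = []
        for tool in self._tools.values():
            haystack = f"{tool.name} {tool.summary}".lower()
            score = sum(1 for word in words if word in haystack)
            if score:
                scored.append((score, tool.name))
        # Highest score first, then alphabetical for stable ordering.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:limit]]

    def manual(self, name: str) -> dict:
        """Return the full schema for one tool (the 'manual')."""
        return self._tools[name].schema


registry = ToolRegistry([
    Tool("send_email", "Send an email message to a recipient",
         {"type": "object", "properties": {"to": {"type": "string"}}}),
    Tool("query_db", "Run a read-only SQL query",
         {"type": "object", "properties": {"sql": {"type": "string"}}}),
    Tool("resize_image", "Resize an image file",
         {"type": "object", "properties": {"width": {"type": "integer"}}}),
])

matches = registry.search("resize image")
schema = registry.manual(matches[0])
```

The design choice is the split itself: the session starts with nothing but a search function, and schema tokens are spent only on the tools the task turns out to need.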
$ ship --agent --with-judgment_
Vercel publishes a working framework for "agent responsibly" — the difference between leveraging AI and relying on it. Concrete guardrails for code reviews on agent-generated PRs, sandboxing for runtime tool use, and the human-in-the-loop checkpoints that actually matter.
# reviewing an agent-authored PR: the new checklist
$ agent.diff --explain                # model summarises intent
$ agent.tests --run --strict          [passed 247 / 247]
$ agent.scope --check                 [bounded to /apps/web]
$ agent.prod-keys --grep              [NONE, required]
$ human.review --required-on-merge    [approved by you]
# merge unblocked. shipped.
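The checklist above is illustrative rather than a real CLI, but the same gates are easy to sketch in ordinary code. The following is a rough version under stated assumptions: `gate_failures`, `in_scope`, the repo-relative `apps/web` root, and the deliberately crude secret pattern are all hypothetical, and a real setup would lean on a dedicated secret scanner.

```python
import posixpath
import re

# Illustrative pattern only; it will miss many real secrets.
SECRET_PATTERN = re.compile(r"(api[_-]?key|secret|token)\s*[:=]\s*\S+", re.IGNORECASE)


def in_scope(path: str, allowed_root: str) -> bool:
    """True if a repo-relative path stays inside allowed_root after normalisation."""
    normalised = posixpath.normpath(path)
    root = posixpath.normpath(allowed_root)
    return normalised == root or normalised.startswith(root + "/")


def gate_failures(changed_paths, diff_text, tests_passed, tests_total,
                  human_approved, allowed_root="apps/web"):
    """Return a list of reasons the merge is blocked; an empty list means unblocked."""
    failures = []
    if tests_passed != tests_total:
        failures.append(f"tests: {tests_passed} / {tests_total} passed")
    out_of_scope = [p for p in changed_paths if not in_scope(p, allowed_root)]
    if out_of_scope:
        failures.append(f"scope: changes outside {allowed_root}: {out_of_scope}")
    if SECRET_PATTERN.search(diff_text):
        failures.append("secrets: diff contains a key-like assignment")
    if not human_approved:
        failures.append("review: human approval missing")
    return failures


clean = gate_failures(["apps/web/page.tsx"], "+ const x = 1", 247, 247, True)
blocked = gate_failures(["apps/api/server.ts"], "+ API_KEY=abc123", 246, 247, False)
```

Returning reasons instead of a bare boolean keeps the human checkpoint useful: the reviewer sees which gate failed, not just that one did.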
A network that fails small, on purpose.
Cloudflare wraps up its multi-quarter "Code Orange" reliability initiative, a top-to-bottom rewrite of how dependencies, blast radii, and recovery paths are designed across the edge. The post-mortem-meets-postcard is required reading for anyone running infrastructure at any scale that involves the phrase "blast radius."
Code Orange started after a year in which a handful of incidents took out far more of the network than any one of them deserved to. The fix wasn't another runbook. It was an internal mandate to redesign for failure modes that stay small: every dependency labelled, every fault domain bounded, every recovery path practiced before the alarm fires.
The result, the team writes, isn't a network that doesn't fail. It's a network where any single failure stops being a CNN headline. That distinction is the entire point. If you operate something with a control plane and a data plane, this is the post-mortem-as-playbook you wanted.
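One way to picture "every dependency labelled, every fault domain bounded" is as a check over a dependency graph. This is a toy sketch, not Cloudflare's tooling: the service names, the `blast_radius` and `over_budget` helpers, and the budget of two affected services are all assumptions made for illustration.

```python
from collections import deque


def blast_radius(depends_on, failed):
    """Return every service that transitively depends on `failed`."""
    # Invert the graph: for each component, who depends on it directly?
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service != failed and service not in affected:
                affected.add(service)
                queue.append(service)
    return affected


def over_budget(depends_on, max_affected):
    """Map each component whose failure affects more than max_affected services to that count."""
    components = set(depends_on)
    for deps in depends_on.values():
        components.update(deps)
    report = {}
    for component in sorted(components):
        size = len(blast_radius(depends_on, component))
        if size > max_affected:
            report[component] = size
    return report


# Hypothetical labelled dependencies: service -> what it needs to run.
graph = {
    "dashboard": ["api"],
    "api": ["config-store", "auth"],
    "auth": ["config-store"],
    "edge-proxy": ["config-store"],
    "config-store": [],
}

report = over_budget(graph, max_affected=2)
```

Here only `config-store` exceeds the budget, which is the kind of shared dependency a "fail small" redesign would target first.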
Kimi K2.6 goes for the long session.
Moonshot AI's updated Kimi handles longer autonomous coding sessions and scales up its multi-agent orchestration relative to its predecessor. Open weights, with measurable gains where most coding agents still drift after an hour. Worth pulling for anyone whose coding-agent budget is starting to look like a salary.
- i. Longer autonomous coding sessions before drift.
- ii. Multi-agent orchestration scaled up.
- iii. Open weights: pull, host, evaluate at home.
- iv. Pricing that pressures closed leaders.
Better agents, with the right plumbing.
Fly walks through using MorphLLM as the editing brain inside an agent loop on Fly Machines — fast file rewriting, scoped sandboxes, and a deployment shape that keeps cost predictable when you're running dozens of agent attempts in parallel. A practical take on the "where does the agent actually live?" question.
The edit step
MorphLLM specialises in one thing — fast, accurate file rewrites — so the orchestrator doesn't burn frontier-model tokens on mechanical patches.
The host shape
Fly Machines spin up per attempt, isolated and disposable. No long-lived "agent server" — just a fleet of short-lived sandboxes.
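The "one disposable sandbox per attempt" shape can be sketched locally without any Fly or MorphLLM API. In this stand-in, a temporary directory plays the role of the machine and a string replacement plays the role of the edit step; `run_attempt` and `run_fleet` are hypothetical names, and a directory is of course far weaker isolation than a real VM.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_attempt(attempt_id: int, source: str) -> dict:
    """Run one attempt in its own scratch directory, which is deleted on exit."""
    with tempfile.TemporaryDirectory(prefix=f"attempt-{attempt_id}-") as workdir:
        target = Path(workdir) / "main.py"
        target.write_text(source)
        # Stand-in for the edit step: a mechanical patch applied inside the sandbox.
        patched = target.read_text().replace("TODO", f"done by attempt {attempt_id}")
        target.write_text(patched)
        return {"attempt": attempt_id, "workdir": workdir, "output": patched}


def run_fleet(source: str, attempts: int) -> list[dict]:
    """Launch all attempts in parallel; nothing long-lived survives the call."""
    with ThreadPoolExecutor(max_workers=attempts) as pool:
        return list(pool.map(lambda i: run_attempt(i, source), range(attempts)))


results = run_fleet("print('TODO')\n", attempts=4)
```

Because each attempt owns and then destroys its workspace, a failed or runaway attempt leaves nothing behind for the next one to trip over, which is the property the per-attempt machine model is buying.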
That's today.
Seven picks, six sources, one Saturday. Tomorrow morning at 08:00 Zürich, again — same rubric, different surface.
Sources today —
Anthropic Engineering · OpenAI · Vercel · Cloudflare · DeepLearning.AI The Batch · Fly.io.
Rubric —
Tools to adopt this week · creative software · dev tools & agentic coding · privacy & security · research with a practical kernel · anything actionable for a senior engineer or founder.
Issue 014 · 02 May 2026 · Zürich.