EPHEMERIS No. 002 Mon · 20 Apr 2026 · Zürich

Ephemeris.

Ten picks from eight corners of the web, heavy on dev primitives today because the platforms all shipped at once.

01 / 10 OpenAI · Dev Tools · Agents
The Cover Story

Codex is now the whole desk.

OpenAI's Codex app ships a bundle that used to be five separate products: computer use, in-app browsing, image generation, memory, and a plugin surface. The framing is modest ("developer workflows"), but the direction is not. Codex is being positioned as the layer where your other tools become callable, on both macOS and Windows. If you've been waiting for the moment to move serious work out of the browser and into an assistant, this is it.

Read on OpenAI →
02 / 10 Anthropic Engineering · Agentic Coding
Engineering

A safer way to skip permissions.

Claude Code's new auto mode sits between "approve every tool call" and "YOLO --dangerously-skip-permissions." Anthropic's engineers describe a sandbox-plus-policy layer that lets an agent run unattended while still refusing the operations you never want it to try. The post is a recipe for how to wire one up in your own harness.

Don't remove the prompts. Remove the need for prompts.
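
A minimal sketch of the policy half of that idea, assuming a first-match-wins rule list. The types, the rules, and the `decide` function are all hypothetical; none of them come from Claude Code or from Anthropic's post, and a real setup would pair this with an actual sandbox.

```typescript
// Hypothetical types for illustration only.
type Decision = "allow" | "deny" | "ask";

interface ToolCall {
  tool: string;      // e.g. "bash" or "write_file"
  argument: string;  // the command line or the target path
}

interface Rule {
  tool: string;
  pattern: RegExp;
  decision: Decision;
}

// Rules are checked in order and the first match wins,
// so deny rules sit above the allow rules they must override.
const rules: Rule[] = [
  { tool: "bash", pattern: /\brm\s+-rf\b/, decision: "deny" },
  { tool: "bash", pattern: /\bgit\s+push\b/, decision: "ask" },
  { tool: "bash", pattern: /^(ls|cat|git status)\b/, decision: "allow" },
  { tool: "write_file", pattern: /^\/workspace\//, decision: "allow" },
];

// Anything no rule covers falls back to asking a human.
function decide(call: ToolCall, policy: Rule[] = rules): Decision {
  for (const rule of policy) {
    if (rule.tool === call.tool && rule.pattern.test(call.argument)) {
      return rule.decision;
    }
  }
  return "ask";
}

console.log(decide({ tool: "bash", argument: "ls -la" }));                   // allow
console.log(decide({ tool: "bash", argument: "cat notes.txt; rm -rf ~" }));  // deny
console.log(decide({ tool: "write_file", argument: "/etc/passwd" }));        // ask
```

The ordering is the design choice worth copying: a command that starts with an allowed prefix still gets denied if it also matches a deny rule.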
Read the engineering note →
03 / 10 Cloudflare · Research · Inference

Research

Lossless LLM compression, inference-time.

Cloudflare's research team describes Unweight, a compression scheme that shaves up to 22% off a model's footprint with no measurable quality loss. It's applied at inference, which means no retraining, no quantization knobs — just smaller weights flowing through the same pipeline. They're deploying it across the network.

22% Footprint cut · 0 Quality loss · Provider-agnostic
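
To make the headline number concrete, here is a back-of-envelope sketch. The 22% is the "up to" bound quoted above; the 16 GB model and 80 GB memory budget are made-up examples, and the function names are hypothetical.

```typescript
// Illustrative arithmetic only; sizes are invented for the example.
function compressedSizeGB(originalGB: number, savings: number): number {
  if (savings < 0 || savings >= 1) {
    throw new RangeError("savings must be in [0, 1)");
  }
  return originalGB * (1 - savings);
}

// How many copies of a model fit in a fixed memory budget.
function copiesThatFit(budgetGB: number, modelGB: number): number {
  return Math.floor(budgetGB / modelGB);
}

const before = 16;                             // hypothetical model size, GB
const after = compressedSizeGB(before, 0.22);  // about 12.48 GB
console.log(copiesThatFit(80, before));        // 5
console.log(copiesThatFit(80, after));         // 6
```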
Read the paper →
04 / 10 Fly.io · Storage · SQLite
~/app $ sqlite3 notes.db "insert into ..." && sync→s3

SQLite that writes through S3.

Ben Johnson's new Litestream Writable VFS turns the pattern that powered half the indie stack — SQLite + replicated streams — into something that writes. The database file is virtual; the pages live in S3-compatible storage; you keep the `sqlite3` binary.

The practical upshot is that you can put a SQLite-shaped workload on serverless compute without reaching for Postgres or worrying about which pod owns the disk.

  ╭──────────────────────────╮
  │  litestream vfs          │
  │    db  →  pages  →  s3   │
  │    ▓▓▓▓▓▓▓▓▓▓▓▓▓▓░  91%  │
  ╰──────────────────────────╯
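
A rough sketch of the "virtual file, pages in object storage" idea. This is not Litestream's API: `ObjectStore`, `MemoryStore`, `PagedFile`, and the key layout are hypothetical, and an in-memory map stands in for an S3-compatible bucket.

```typescript
// Hypothetical interface for a bucket-like store.
interface ObjectStore {
  get(key: string): Uint8Array | undefined;
  put(key: string, value: Uint8Array): void;
}

// In-memory stand-in for S3-compatible storage.
class MemoryStore implements ObjectStore {
  private objects = new Map<string, Uint8Array>();
  get(key: string): Uint8Array | undefined {
    return this.objects.get(key);
  }
  put(key: string, value: Uint8Array): void {
    this.objects.set(key, value.slice());
  }
}

// A file that exists only as fixed-size pages, one object per page.
class PagedFile {
  private store: ObjectStore;
  private name: string;
  private pageSize: number;

  constructor(store: ObjectStore, name: string, pageSize = 4096) {
    this.store = store;
    this.name = name;
    this.pageSize = pageSize;
  }

  private key(page: number): string {
    return `${this.name}/page-${page}`;
  }

  // Pages that were never written read back as zeros.
  readPage(page: number): Uint8Array {
    return this.store.get(this.key(page)) ?? new Uint8Array(this.pageSize);
  }

  // Write at any byte offset by read-modify-writing each page touched.
  write(offset: number, data: Uint8Array): void {
    let written = 0;
    while (written < data.length) {
      const pos = offset + written;
      const page = Math.floor(pos / this.pageSize);
      const within = pos % this.pageSize;
      const n = Math.min(this.pageSize - within, data.length - written);
      const buf = this.readPage(page).slice();
      buf.set(data.subarray(written, written + n), within);
      this.store.put(this.key(page), buf);
      written += n;
    }
  }
}
```

What the sketch leaves out is the hard part: ordering, durability, and what happens when two writers touch the same page.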
$ open fly.io/blog →
05 / 10 Anthropic Engineering · Evals
Deep dive

The infra is the benchmark.

Anthropic's engineers ran the same agentic coding eval on the same model, over and over, varying only the surrounding harness — CPU, disk, network jitter, container image. The spread in scores was large enough to reorder the leaderboard. If you're reading a benchmark, you're reading a benchmark of the pipeline.

The practical advice is to run every new model on your own harness before you believe anyone's numbers, including your own from last week.

The broader point is that "X% on SWE-bench" is a claim about a test rig, and test rigs do not stand still. Shared rigs, with logged environments, are the only version that scales.

Useful nerd-snipe: write down your harness version next to every eval you run.
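
One way to act on that, sketched under assumptions: store a short harness fingerprint with every score and refuse to compare runs whose fingerprints differ. The record shape and field names are hypothetical, not taken from the study, and the hash is plain 32-bit FNV-1a over UTF-16 code units (non-cryptographic, fine for labelling).

```typescript
// Hypothetical description of the rig a score was produced on.
interface Harness {
  version: string;
  containerImage: string;
  cpu: string;
  diskGB: number;
}

interface EvalRecord {
  model: string;
  benchmark: string;
  score: number;
  harness: Harness;
  harnessId: string;
  ranAt: string;
}

// 32-bit FNV-1a, returned as 8 hex digits.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Fields are joined in a fixed order so the same rig always gets the same id.
function harnessId(h: Harness): string {
  return fnv1a([h.version, h.containerImage, h.cpu, String(h.diskGB)].join("|"));
}

function record(model: string, benchmark: string, score: number, harness: Harness): EvalRecord {
  return {
    model,
    benchmark,
    score,
    harness,
    harnessId: harnessId(harness),
    ranAt: new Date().toISOString(),
  };
}

// Two scores are only comparable if the benchmark and the rig both match.
function comparable(a: EvalRecord, b: EvalRecord): boolean {
  return a.benchmark === b.benchmark && a.harnessId === b.harnessId;
}
```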

Read the full study →
06 / 10 Import AI · Research & Safety
Issue · 453

Six ways to break an AI agent.

Jack Clark's weekly rounds up a new paper taxonomising six classes of attack that land against agents operating in the real world: prompt smuggling, environment poisoning, tool-call hijacks, and three quieter ones. Read it as a defender's checklist. The companion item on "gradual disempowerment" is more uncomfortable and harder to act on.

6 Attack classes · against live agents
Read Import AI 453 →
07 / 10 GitHub Engineering · Infra & Security
Field report · deployment

Catch bad deploys in the kernel.

Lawrence Gripper and Aleksey Levenstein walk through how GitHub uses eBPF probes to detect circular dependencies in their deployment tooling before they ship. The win: the check runs everywhere, in the kernel, without a sidecar. The broader win: once you see syscalls as first-class data, a class of "works on my machine" bugs disappears.
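
The eBPF side is beyond a snippet, but the question being asked of the observed data is an old one: does this dependency graph contain a cycle? A minimal sketch, assuming you already have edges of the form "A reached out to B during a deploy". The graph shape, node names, and `findCycle` are hypothetical and are not GitHub's implementation.

```typescript
// Hypothetical input: each key depends on the services in its list.
type Graph = Map<string, string[]>;

// Depth-first search that returns the first cycle found as a path
// (first node repeated at the end), or null if the graph is acyclic.
function findCycle(graph: Graph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const path: string[] = [];

  function visit(node: string): string[] | null {
    state.set(node, "visiting");
    path.push(node);
    for (const next of graph.get(node) ?? []) {
      const seen = state.get(next);
      if (seen === "visiting") {
        // next is already on the current path: that closes a loop.
        return path.slice(path.indexOf(next)).concat(next);
      }
      if (seen === undefined) {
        const found = visit(next);
        if (found) return found;
      }
    }
    path.pop();
    state.set(node, "done");
    return null;
  }

  for (const node of Array.from(graph.keys())) {
    if (!state.has(node)) {
      const found = visit(node);
      if (found) return found;
    }
  }
  return null;
}

const observed: Graph = new Map([
  ["deployer", ["artifact-store"]],
  ["artifact-store", ["deployer"]],
]);
console.log(findCycle(observed)); // [ 'deployer', 'artifact-store', 'deployer' ]
```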

Bad-deploy rate, weekly, after rolling the probe across the fleet. Source: GitHub Engineering.
Read the postmortem →
08 / 10 via @denissexy · Creative AI
Four at once

One photo, a whole 3D world.

See the channel →
09 / 10 Vercel · Durable Execution
Platform update

Long-running functions, finally boring.

Vercel Workflows went GA this week — TypeScript or Python functions that survive process restarts, retries, and hours-long external waits without a separate orchestrator. The pitch is simple: write your code as if it runs for 40 minutes; the platform handles the durable part. If you've been queueing jobs just to get checkpointing, delete that queue.

$ cat workflow.ts
export default workflow(async (step) => {
  const job = await step.call(startJob)
  await step.sleep("1h")  // survives restarts
  return step.call(finalize, job)
})
$ _
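
For intuition on how a step can "survive restarts", here is a toy replay loop. It is a sketch of the general journal-and-replay technique, not Vercel's implementation: `Journal`, `Step`, and `runWorkflow` are hypothetical names, and the journal lives in memory where a real platform would persist it.

```typescript
// Hypothetical mini-runtime for illustration only.
type Journal = unknown[];

interface Step {
  call<T>(fn: () => Promise<T>): Promise<T>;
}

// Runs the workflow body. Steps whose results are already in the journal
// are replayed from it instead of being executed again.
function runWorkflow<T>(journal: Journal, body: (step: Step) => Promise<T>): Promise<T> {
  let cursor = 0;
  const step: Step = {
    call<R>(fn: () => Promise<R>): Promise<R> {
      const index = cursor++;
      if (index < journal.length) {
        return Promise.resolve(journal[index] as R); // replay a finished step
      }
      return fn().then((result) => {
        journal.push(result); // checkpoint before moving on
        return result;
      });
    },
  };
  return body(step);
}
```

Replay only works if the body calls its steps in the same order on every run, which is why durable runtimes tend to push all side effects into steps.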
Read the launch note →
10 / 10 Vercel · Engineering Deep-Dive
Cold-start recipe

How to shave a sandbox.

Vercel's sandbox team walks through the five knobs they turned to cut snapshot-restore time on the hot path. None is novel on its own; the payoff is in stacking them.

01 Parallel fetch · Snapshot chunks
02 Stream decompress · No staging file
03 Local NVMe · Hot cache
04 Page-in on demand · Lazy restore
05 Warm pool · Zero cold
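
A sketch of the first knob only, parallel chunk fetch with a concurrency cap. `fetchChunk` is a hypothetical stand-in for whatever transport serves snapshot chunks, and the default of four workers is an arbitrary choice, not a number from Vercel's post.

```typescript
// Hypothetical transport: resolves with the bytes of chunk `index`.
type FetchChunk = (index: number) => Promise<Uint8Array>;

// Fetches all chunks with at most `concurrency` requests in flight,
// then stitches them together in index order.
async function fetchSnapshot(
  chunkCount: number,
  fetchChunk: FetchChunk,
  concurrency = 4,
): Promise<Uint8Array> {
  const chunks = new Array<Uint8Array>(chunkCount);
  let next = 0;

  // Each worker keeps claiming the next unclaimed index until none remain.
  async function worker(): Promise<void> {
    while (next < chunkCount) {
      const index = next++;
      chunks[index] = await fetchChunk(index);
    }
  }

  const workers: Promise<void>[] = [];
  for (let i = 0; i < Math.min(concurrency, chunkCount); i++) {
    workers.push(worker());
  }
  await Promise.all(workers);

  const total = chunks.reduce((sum, c) => sum + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}
```

This version buffers every chunk before assembling, so it would still need the second knob (streaming decompress) layered on top to avoid holding the whole snapshot in memory.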
Read the deep-dive →
End of issue 002

That's all for today.

Ten picks from OpenAI, Anthropic Engineering, Cloudflare, Fly.io, Import AI, GitHub, Vercel, and one Telegram chaser. Back tomorrow at 08:00 Zürich.

Sources

openai.com/news
anthropic.com/engineering
blog.cloudflare.com
fly.io/blog
jack-clark.net
github.blog/engineering
vercel.com/blog
t.me/denissexy

Rubric

AI tools · creative software · dev tools · privacy · science · the practical. If we can't imagine you using it tomorrow, it doesn't run.

Colophon

Set in Fraunces & Inter (Google Fonts), with JetBrains Mono on the terminal pages. Hand-laid in HTML/CSS. Issue 002 assembled 08:00 CEST, Mon 20 Apr 2026. Sentry & PostHog blogs declined to render; noted and skipped.