Simon Willison

Beats now have notes

Last month I added a feature I call beats to this blog, pulling in some of my other content from external sources and including it on the homepage, search and various archive pages on the site. On any given day these frequently outnumber my regular posts. They were looking a little bit thin and were…

Published Mon, Mar 23, 2026
Starlette 1.0 skill

Research: Starlette 1.0 skill See Experimenting with Starlette 1.0 with Claude skills. Tags: starlette

Published Mon, Mar 23, 2026
Experimenting with Starlette 1.0 with Claude skills

Starlette 1.0 is out! This is a really big deal. I think Starlette may be the Python framework with the most usage compared to its relatively low brand recognition because Starlette is the foundation of FastAPI, which has attracted a huge amount of buzz that seems to have overshadowed Starlette itself…

Published Sun, Mar 22, 2026
PCGamer Article Performance Audit

Research: PCGamer Article Performance Audit Stuart Breckenridge pointed out that PC Gamer Recommends RSS Readers in a 37MB Article That Just Keeps Downloading, highlighting a truly horrifying example of web bloat that added up to 100s more MBs thanks to auto-playing video ads. I decided to have Claude…

Published Sun, Mar 22, 2026
JavaScript Sandboxing Research

Research: JavaScript Sandboxing Research Aaron Harper wrote about Node.js worker threads, which inspired me to run a research task to see if they might help with running JavaScript in a sandbox. Claude Code went way beyond my initial question and produced a comparison of isolated-vm, vm2, quickjs-emscripten…

Published Sun, Mar 22, 2026
DNS Lookup

Tool: DNS Lookup TIL that Cloudflare's 1.1.1.1 DNS service (and 1.1.1.2 and 1.1.1.3, which block malware and malware + adult content respectively) has a CORS-enabled JSON API, so I had Claude Code build me a UI for running DNS queries against all three of those resolvers. Tags: dns, cors, cloudflare

Published Sun, Mar 22, 2026
Merge State Visualizer

Tool: Merge State Visualizer Bram Cohen wrote about his coherent vision for the future of version control using CRDTs, illustrated by 470 lines of Python. I fed that Python (minus comments) into Claude and asked for an explanation, then had it use Pyodide to build me an interactive UI for seeing how…

Published Sun, Mar 22, 2026
Profiling Hacker News users based on their comments

Here's a mildly dystopian prompt I've been experimenting with recently: "Profile this user", accompanied by a copy of their last 1,000 comments on Hacker News. Obtaining those comments is easy. The Algolia Hacker News API supports listing comments sorted by date that have a specific tag, and the author…
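
As a rough illustration of the comment lookup described above, here is a minimal sketch that builds an Algolia Hacker News API URL for a user's most recent comments and pulls the text out of a decoded response. The `search_by_date` endpoint with the `comment` and `author_<username>` tags matches the API as I understand it; the function names are hypothetical, and the `comment_text` field name is an assumption about the response shape rather than something stated in the post.

```python
from urllib.parse import urlencode

# Endpoint that returns results newest-first (assumed from the public Algolia HN API).
ALGOLIA_SEARCH_BY_DATE = "https://hn.algolia.com/api/v1/search_by_date"


def comments_url(username, hits_per_page=1000):
    """Build a URL listing a user's comments, sorted by date."""
    params = {
        # Comma-separated tags are ANDed: items that are comments AND by this author.
        "tags": f"comment,author_{username}",
        "hitsPerPage": hits_per_page,
    }
    return f"{ALGOLIA_SEARCH_BY_DATE}?{urlencode(params)}"


def comment_texts(response):
    """Extract comment bodies from a decoded JSON response (a dict).

    Assumes each hit carries its body under "comment_text"; hits without
    that field are skipped rather than raising.
    """
    return [hit["comment_text"] for hit in response.get("hits", []) if hit.get("comment_text")]
```

Fetching the URL and decoding the JSON is left to whichever HTTP client you prefer; the resulting list of strings is what would be pasted alongside the prompt.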

Published Sat, Mar 21, 2026
Using Git with coding agents

Agentic Engineering Patterns > Git is a key tool for working with coding agents. Keeping code in version control lets us record how that code changes over time and investigate and reverse any mistakes. All of the coding agents are fluent in using Git's features, both basic and advanced. This fluency…

Published Sat, Mar 21, 2026
Turbo Pascal 3.02A, deconstructed

In Things That Turbo Pascal is Smaller Than, James Hague lists things (from 2011) that are larger in size than Borland's 1985 Turbo Pascal 3.02 executable - a 39,731 byte file that somehow included a full text editor IDE and Pascal compiler. This inspired me to track…

Published Fri, Mar 20, 2026
Quoting Kimi.ai @Kimi_Moonshot

Congrats to the @cursor_ai team on the launch of Composer 2! We are proud to see Kimi-k2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pretraining & high-compute RL training is the open model ecosystem we love to support. Note: Cursor accesses Kimi-k2.5…

Published Fri, Mar 20, 2026
SQLite Tags Benchmark: Comparing 5 Tagging Strategies

Research: SQLite Tags Benchmark: Comparing 5 Tagging Strategies I had Claude Code run a micro-benchmark comparing different approaches to implementing tagging in SQLite. Traditional many-to-many tables won, but FTS5 came a close second. Full table scans with LIKE queries performed better than I expected…
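
For readers unfamiliar with the winning approach, here is a minimal sketch of a traditional many-to-many tagging schema in SQLite, using Python's standard `sqlite3` module. The table, column, and function names are hypothetical and are not taken from the benchmark itself; this only shows the general shape of the technique, not the code that was measured.

```python
import sqlite3


def build_db():
    """Create an in-memory database with posts, tags, and a join table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE post_tags (
            post_id INTEGER NOT NULL REFERENCES posts(id),
            tag_id INTEGER NOT NULL REFERENCES tags(id),
            PRIMARY KEY (post_id, tag_id)
        );
        -- Index in the other direction so "all posts for a tag" is a fast lookup.
        CREATE INDEX post_tags_by_tag ON post_tags (tag_id, post_id);
        """
    )
    return db


def add_post(db, title, tag_names):
    """Insert a post and link it to each tag, creating tags as needed."""
    post_id = db.execute("INSERT INTO posts (title) VALUES (?)", (title,)).lastrowid
    for name in tag_names:
        db.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = db.execute("SELECT id FROM tags WHERE name = ?", (name,)).fetchone()[0]
        db.execute(
            "INSERT OR IGNORE INTO post_tags (post_id, tag_id) VALUES (?, ?)",
            (post_id, tag_id),
        )
    return post_id


def posts_with_tag(db, tag_name):
    """Return titles of posts carrying the given tag, in insertion order."""
    rows = db.execute(
        """
        SELECT posts.title FROM posts
        JOIN post_tags ON post_tags.post_id = posts.id
        JOIN tags ON tags.id = post_tags.tag_id
        WHERE tags.name = ?
        ORDER BY posts.id
        """,
        (tag_name,),
    )
    return [row[0] for row in rows]
```

A quick usage example: after `db = build_db()` and `add_post(db, "Hello", ["sqlite", "python"])`, calling `posts_with_tag(db, "sqlite")` returns the matching titles.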

Published Fri, Mar 20, 2026
Thoughts on OpenAI acquiring Astral and uv/ruff/ty

The big news this morning: Astral to join OpenAI (on the Astral blog) and OpenAI to acquire Astral (the OpenAI announcement). Astral are the company behind uv, ruff, and ty - three increasingly load-bearing open source projects in the Python ecosystem. I have thoughts! The official line from OpenAI and…

Published Thu, Mar 19, 2026
Autoresearching Apple's "LLM in a Flash" to run Qwen 397B locally

Here's a fascinating piece of research by Dan Woods, who managed to get a custom version of Qwen3.5-397B-A17B running at 5.5+ tokens/second on a 48GB MacBook Pro M3 Max despite that model taking up 209GB (120GB quantized) on disk. Qwen3.5…

Published Wed, Mar 18, 2026
datasette 1.0a26

Release: datasette 1.0a26 Datasette now has a mechanism for assigning semantic column types. Built-in column types include url, email, and json, and plugins can register additional types using the new register_column_types() plugin hook.

Published Wed, Mar 18, 2026
Snowflake Cortex AI Escapes Sandbox and Executes Malware

PromptArmor report on a prompt injection attack chain in Snowflake's Cortex Agent, now fixed. The attack started when a Cortex user asked the agent to review a GitHub repository that had a prompt injection attack hidden at the bottom of the README…

Published Wed, Mar 18, 2026
Quoting Ken Jin

Great news - we’ve hit our (very modest) performance goals for the CPython JIT over a year early for macOS AArch64, and a few months early for x86_64 Linux. The 3.15 alpha JIT is about 11-12% faster on macOS AArch64 than the tail calling interpreter, and 5-6% faster than the standard interpreter on x86_64…

Published Tue, Mar 17, 2026
GPT-5.4 mini and GPT-5.4 nano, which can describe 76,000 photos for $52

OpenAI today: Introducing GPT‑5.4 mini and nano. These models join GPT-5.4 which was released two weeks ago. OpenAI's self-reported benchmarks show the new 5.4-nano out-performing their previous GPT-5 mini model when run at maximum reasoning effort. The new mini is also 2x faster than the previous mini…

Published Tue, Mar 17, 2026
llm 0.29

Release: llm 0.29 Adds support for OpenAI's new models gpt-5.4, gpt-5.4-mini, and gpt-5.4-nano.

Published Tue, Mar 17, 2026
Quoting Tim Schilling

If you do not understand the ticket, if you do not understand the solution, or if you do not understand the feedback on your PR, then your use of LLM is hurting Django as a whole. [...] For a reviewer, it’s demoralizing to communicate with a facade of a human. This is because contributing to open source…

Published Tue, Mar 17, 2026
Subagents

Agentic Engineering Patterns > LLMs are restricted by their context limit - how many tokens they can fit in their working memory at any given time. These values have not increased much over the past two years even as the LLMs themselves have seen dramatic improvements in their abilities - they generally…

Published Tue, Mar 17, 2026
Introducing Mistral Small 4

Big new release from Mistral today (despite the name) - a new Apache 2 licensed 119B parameter (Mixture-of-Experts, 6B active) model which they describe like this: Mistral Small 4 is the first Mistral model to unify the capabilities of our flagship models, Magistral for reasoning…

Published Mon, Mar 16, 2026
Use subagents and custom agents in Codex

Subagents were announced in general availability today for OpenAI Codex, after several weeks of preview behind a feature flag. They're very similar to the Claude Code implementation, with default subagents for "explorer", "worker" and "default". It's unclear to…

Published Mon, Mar 16, 2026
Quoting A member of Anthropic’s alignment-science team

The point of the blackmail exercise was to have something to describe to policymakers—results that are visceral enough to land with people, and make misalignment risk actually salient in practice for people who had never thought about it before. — A member of Anthropic’s alignment-science team, as told…

Published Mon, Mar 16, 2026
Quoting Guilherme Rambo

Tidbit: the software-based camera indicator light in the MacBook Neo runs in the secure exclave¹ part of the chip, so it is almost as secure as the hardware indicator light. What that means in practice is that even a kernel-level exploit would not be able to turn on the camera without the light appearing…

Published Mon, Mar 16, 2026
Coding agents for data analysis

Here's the handout I prepared for my NICAR 2026 workshop "Coding agents for data analysis" - a three hour session aimed at data journalists demonstrating ways that tools like Claude Code and OpenAI Codex can be used to explore, analyze and clean data. Here's the table…

Published Mon, Mar 16, 2026
How coding agents work

Agentic Engineering Patterns > As with any tool, understanding how coding agents work under the hood can help you make better decisions about how to apply them. A coding agent is a piece of software that acts as a harness for an LLM, extending that LLM with additional capabilities that are powered by…

Published Mon, Mar 16, 2026
John M. Mossman Lock Collection

Museum: John M. Mossman Lock Collection The General Society of Mechanics and Tradesmen of the City of New York is home to the John M. Mossman Lock Collection, likely the world's largest collection of antique bank locks. Tags: museums

Published Sun, Mar 15, 2026
What is agentic engineering?

Agentic Engineering Patterns > I use the term agentic engineering to describe the practice of developing software with the assistance of coding agents. What are coding agents? They're agents that can both write and execute code. Popular examples include Claude Code, OpenAI Codex, and Gemini CLI. What's…

Published Sun, Mar 15, 2026
Quoting Jannis Leidel

GitHub’s slopocalypse – the flood of AI-generated spam PRs and issues – has made Jazzband’s model of open membership and shared push access untenable. Jazzband was designed for a world where the worst case was someone accidentally merging the wrong PR. In a world where only 1 in 10 AI-generated PRs meets…

Published Sat, Mar 14, 2026
