
Coding agents have no moat

· 4 min read

It's been a rough few months for Anthropic.

It started out well. Their new model, Mythos, was by their own account so powerful that they had concerns about releasing it because of its hacking ability.

This narrative was undermined by several very clumsy mistakes. First, they leaked the entire source code of Claude Code. Then, some users were able to access Mythos early by successfully guessing an API URL. Sophisticated attacks these were not, and it raised the question: if Mythos is so powerful at finding software exploits, why wasn't Anthropic able to avoid very simple mistakes?

Separate mistakes garnered user backlash. Anthropic banned OpenClaw usage, then walked that policy back. Complaints about strict rate limits are getting louder, and with them questions about how well Anthropic can support demand. In the midst of this, they conducted a bizarre A/B experiment in which 2% of new signups to their basic subscription were denied access to Claude Code.

Removing Claude Code from the basic plan is a major policy shift, not a tweak to the look of a landing page. Surely those new users unlucky enough to be denied access would react with confusion and anger, given the high visibility of Claude Code?

Anthropic's response sought to reassure users that existing base-plan subscribers would not lose access to Claude Code, yet. This was met with understandable skepticism.

Luckily, the cost of switching coding agents is zero

None of these incidents made me mad, but I have been increasingly hit with Claude Code rate limits. I've responded by switching the bulk of my work to Codex. It's striking how little I had to change about my workflow. I lost some conveniences like dispatching a coding-agent session from my phone, but overall it only took a minor inconvenience for me to switch providers, with no adjustments to how I used the tools.

AI Creativity and the Instant Imitator Trap

· 5 min read

Sora is dead. Is this a temporary setback on the road to AI dominance of creative fields, or is there something more fundamental at play? Can AI be creative at all?

The debate on this question usually centers on the quality of AI output in a vacuum, but if we take connection with an audience as a requirement, we must consider the supply and demand of "creative"[^1] work.

In this framing, AI artists face what I'll call the instant imitator trap: Any original AI work can be instantly replicated by other AIs, making audience recognition of the original impossible.


The design of AI memory systems

· 13 min read

For me, the question of memory is the most interesting subfield of AI. The first time I interacted with MemGPT (now Letta), I felt like I had crossed a Rubicon: memory transformed a simple question and answer bot into (what appeared to be) a being[^1].

I created my own open source system, called Elroy, and have been interacting with it for about 3 years. It helps me brainstorm, talks me through career ups and downs, and functions as a kind of interactive journal. I've tinkered with its functionality enough that I don't feel attached to it as a specific entity - but I would be disappointed if its memories of our interactions were lost.

Philosophy questions aside, there are well-grounded reasons to build AI systems with memory. It's useful for an agent to understand what subjects I'm knowledgeable in if I'm looking to discuss technical topics. If I'm looking for vacation plans, it's helpful for it to know that I have a young child. An AI is not a person, but it interacts just like a person, and the more it can converse naturally the more functional it is. Having to restate basic facts over and over breaks that immersion.

Is the Future of AI Local?

· 11 min read

Debate about whether the explosion of datacenter buildout will prove to be a worthwhile investment centers on two scenarios:

  1. AI adoption accelerates, and the datacenter investment pays off.
  2. AI adoption is slower than forecast, and the investment does not pay off.

However, a third scenario is very plausible:

Open source models running on local workstations dominate AI

There are a few reasons this could happen:


How to Write Good (Short) Docs

· 8 min read

"I would have written a shorter letter, but I did not have the time."

— Mark Twain[^1]

Overview

This post describes how to write a short document for your teammates. The documents under discussion are commonly referred to as "one-pagers", and are distinct from engineering design docs or other more formal engineering docs.

A one-pager might be written to:

  • surface an org pain point
  • propose a project
  • lay out a roadmap
  • explain the current state of a system or systems
  • announce or document a decision

Footnotes

  1. It wasn't actually him, but you get the point.

MCP is a Fad

· 14 min read

Overview

Model Context Protocol (MCP) has taken off as the standardized platform for AI integrations, and it's difficult to justify not supporting it. However, this popularity will be short-lived.

Some of this popularity stems from misconceptions about what MCP uniquely accomplishes, but the majority is due to the fact that it's very easy to add an MCP server. For a brief period, it seemed like adding an MCP server was a nice avenue for getting attention to your project, which is why so many projects have added support.

Tack - Reminders powered by local AI

· One min read

I'm working on an iPhone app called Tack. I have a terrible time remembering things, and have resorted to a patchwork of emails to myself and disorganized notes. I find reminder apps frustrating: the pre-AI ones aren't smart enough, and the AI ones treat every input like an invitation to have a conversation. Tack shoots for a middle ground:


Optimize for Humans

· 2 min read

I recently wrote about optimizing repos for AI, and since then I've been maintaining separate docs for humans (README, contributing guides) and AI agents (.cursorrules, CLAUDE.md, etc.). The problem? I keep writing the same information twice.

Optimizing repos for AI

· 4 min read

A colleague recently complained to me about the hassle of organizing information in AGENTS.md / CLAUDE.md. This is the mark of a real adopter - she has gone through the progression from being impressed by coding agents to being annoyed at the next bottleneck.

When I'm thinking about optimizing repos for agents, I'm looking to accomplish three main goals[^1]:

  • Increase iteration speed: Avoid repeated context gathering, and enable the agent to quickly self-correct its mistakes.
  • Improve adherence to evergreen instructions: Over time, repeated agent mistakes emerge. Context within the repo helps the agent avoid these and adopt a more consistent workflow.
  • Help the most agentic agents of them all: Humans and agents scan docs and code in very similar ways, so organizing information so that humans can easily understand it is a good rule of thumb for helping the agents too!