Quick notes on LLMs (as of Feb '26)

Claude Opus 4.6 and Codex 5.3 were just released.

My experience so far: Claude is much slower than Codex. Claude Code kinda sucks: it often hangs for a really long time before starting to do anything. We're talking a 10-minute hang before receiving any tokens. Meanwhile, Codex has gotten way faster. I can't help but regret my recent purchase of Claude Max (even though it's half the price of Codex).

I'm going to have to do a lot of testing between these two models before my subs run out.

As for workflows: I keep giving in to the temptation to let the agents take on a large chunk of work. I've had this problem before and noted that it's often unsuccessful, yet I'm still making the same mistake. There was a recent post by Mitchell Hashimoto about his experience working with agents. I think his discussion is worth repeating to myself often.

I've softened my stance on spec/document-driven workflows with agents. The compromise is that agents ought to consume hand-written documents, i.e. you should always review documents written by LLMs and either prune/edit them or write a fresh document based on their findings. LLM-written documents do tend to be a decent way to consume output from the agents, though.