What I Built When Chat Stopped Being Enough
The Agentic Studio
You are explaining your project to AI for the third time this week. Not because anything changed, but because it forgot. You paste your notes back in, re-explain what you are working on, re-explain how you want it to sound, re-explain the decisions you already made two sessions ago. By the fifth restart you are spending more time reconstructing context than producing work. By the tenth, you have quietly stopped using AI for anything that matters.
I was there. Except I was sitting on twenty years of material I could not use.
I spent over two decades in academic philosophy. I published dozens of articles and a book, supervised students, held research positions at Chicago and Leipzig. When I left academia for the tech industry, the material came with me the way things come with you when you move countries: boxed, unlabeled, shoved into a corner until you stop seeing the boxes. Reading notes on hundreds of books. Lecture outlines. Seminar transcripts. Half-written papers. A dead website. The activation energy to do anything with it was enormous, and a chatbot did not lower it. I could ask it about Cavell, but it would give me the Wikipedia version, not the version shaped by my twenty years of reading. Without my context, the AI was useless for anything that required depth.
What changed
Something changed when AI tools stopped being conversation partners and became project collaborators. In 2025, tools built for software development (Claude Code, Cursor, Codex, Windsurf, and others) brought a new capability to general-purpose AI work: the ability to read and navigate a structured set of files before the user types anything. You could stop chatting and start building a workspace.
That shift is connected to something I have been writing about: natural language is becoming the software layer. A style guide used to be a document that humans consulted. Now it is an instruction set that directly determines machine behavior. A task list used to be a reminder for the author. Now it is a behavioral gate: the AI reads the status markers and decides what to work on, what to skip, and what to flag as blocked. The words in these files are not descriptions. They are programs. And like any program, their precision determines the quality of the output.
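To make "task list as behavioral gate" concrete, here is a hypothetical TASKS.md fragment. The status markers and task names are illustrations, not a fixed syntax from my actual files:

```markdown
# TASKS.md

## Active
- [ ] essay-series/agentic-studio: revise draft v13 (focus: opening)
- [!] website: deploy blocked, waiting on domain decision

## Backlog
- [ ] research-db: tag remaining 2019 reading notes

## Done
- [x] essay-series/agentic-studio: restructure intro (v12)
```

An agent that reads this file works on the unchecked items, skips the done log, and flags the `[!]` item as blocked, precisely because the markers tell it to. Change a marker and you change the machine's behavior, which is what it means for the file to be a program.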
Once I understood this, the activation energy collapsed. I could take twenty years of notes, structure them into something a machine could navigate, write a set of instruction files in natural language, and have an AI that understood my project before every session. I call this workspace the Agentic Studio.
What the setup looks like
The architecture is a structured folder of text files that lives on your computer. You open the folder with an AI tool, the tool reads the files inside, and from that point on it has your project's context before you say a word. (Technically a Git repository with Markdown, but if that means nothing to you, "a structured folder of text files" is accurate enough.)
agentic-studio/
├── CLAUDE.md ← Operating instructions: the AI reads this first
├── STYLE.md ← Voice rules, accumulated over dozens of sessions
├── INDEX.md ← Reading order: what to load, in what sequence
├── TASKS.md ← Active work, backlog, and done log
├── SESSION-REVIEW.md ← End-of-session protocol
│
├── knowledge-base/
│ ├── memory/
│ │ ├── projects/ ← Per-project strategy, decisions, context
│ │ └── people/ ← Collaborator profiles, communication styles
│ ├── docs/
│ │ ├── research-db/ ← Structured reading notes, cross-referenced
│ │ └── references/ ← Source material organized by topic
│ └── sources/ ← Raw inputs: transcripts, exports, annotations
│
└── projects/
├── essay-series/ ← Drafts, version history, editorial notes
    ├── essay-series/ ← Drafts, version history, editorial notes
    └── website/ ← Site content, publication pipeline

Inside the knowledge base, all my old academic material now lives, restructured into something a machine can navigate. Reading notes organized atomically. Book summaries. Lecture fragments. Research databases cross-referencing Girard, Cavell, Kripke, Harari, and a few hundred other sources.
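For readers who want a picture of what the operating-instructions file contains, here is a minimal hypothetical sketch of a CLAUDE.md. The real file is longer and more project-specific; this shows only the shape:

```markdown
# CLAUDE.md

1. Read INDEX.md first. Load only the files it lists for the current task.
2. Follow STYLE.md for all prose. Its rules are binding, not suggestions.
3. Check TASKS.md before starting; never begin an item marked blocked.
4. At the end of a session, run the protocol in SESSION-REVIEW.md.
```

Four lines are enough to change how every session starts: the tool stops improvising and starts following the house rules.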
At one point I ran embeddings over the whole corpus to make it semantically searchable. That experiment taught me something I did not expect. The semantic search itself was fine, but the real value was upstream: preparing the material for embeddings forced me to clean, tag, and organize years of notes that had been in various states of disorder since 2009. The index I built to structure the corpus turned out to be more useful than the search engine on top of it. I use the index every day. I almost never use the embeddings. The structuring was the point.
I work across Claude (desktop, terminal, Cowork), Cursor, Codex, and Antigravity, often switching between them in the same day. When I am walking or driving, I use OpenClaw through Telegram or WhatsApp: I dictate an insight from a podcast, and an always-on agent searches the knowledge base for connections and files a seed for the next writing session. A different agent handles the deep work: writing, editing, building. Both read the same files. Neither needs to know what the other did; the files carry the context. Sometimes the connection it finds is better than the one I had in mind. Sometimes it is wrong. Either way, by the time I sit down, the idea is filed and waiting. For editing Markdown files, I use Antigravity or Cursor, but Obsidian, Typora, or any editor with Markdown preview works.
The architecture is tool-agnostic by design. The intelligence lives in the files, not in the client. I started a draft in Claude on my laptop, continued it in Cursor on my desktop, and captured a connection via voice on my phone while driving. Each tool read the same files, followed the same instructions, picked up where the last one left off.
Three practices that make it work
Progressive disclosure
The most counterintuitive lesson: giving the AI more context makes it worse. My instinct was to load everything: the full knowledge base, all the draft history, every note. The output became generic, the voice disappeared into a fog of averaged-out prose, and the AI started hallucinating connections between ideas that do not belong together.
The fix: control what the AI reads and in what order. An index file specifies what to load for each task. The AI reads the project memory, then the current draft, and nothing else unless I point it there.
The problem was never that the AI could not remember enough. It was that I had not decided what each task actually needed. Once I started placing context deliberately instead of dumping it all in, the output transformed. With the full knowledge base loaded, the AI produced sentences like "Cavell's ordinary language philosophy offers a framework for understanding how AI navigates the gap between meaning and use": accurate, fluent, and something I would never put my name on. With only the project memory and the current draft, it produced sentences I could work with. Less context, better output. The constraint is the feature.
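A hypothetical INDEX.md entry for a drafting task might look like this; the paths and task name are illustrative, not copied from my files:

```markdown
## Task: revise an essay draft
Load, in order:
1. knowledge-base/memory/projects/essay-series.md
2. The current draft in projects/essay-series/
Do NOT load research-db/ or sources/ unless explicitly pointed there.
```

The negative instruction does most of the work. Progressive disclosure is less about what the index includes than about what it forbids the AI from reading by default.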
Iterative refinement: mentoring the machine
For eleven years I supervised students. The iterative cycle was the real work: you read a draft, identify where the argument breaks, explain why it breaks, and send them back to try again. Then you read the next version, and the next. Each round you push on something different. You do not fix the text. You update the student's internal model of what good work looks like, one revision at a time.
Working with AI in the studio is that same process. This essay went through over a dozen drafts. The moment I remember most clearly: reading v1 and seeing it open with "I spent over a decade in academic philosophy." Technically accurate. But I could feel the reader leaving by the second sentence. Not because the background was irrelevant, but because it was in the wrong place, offered before the essay had given the reader any reason to care. I wrote back: "Start with the problem, not with me." The next draft opened with the chatbot frustration. Five versions later, the credentials appeared in paragraph two, inside the argument, where a reader who already cares will actually absorb them. That single correction taught me more about the difference between a machine that follows instructions and a writer who feels an audience than anything I had read about AI alignment.
Each correction sharpened the next round. But what makes this different from supervising a student is what happens to the corrections. A student internalizes feedback and carries it forward in memory. The machine does not. So after every session, the corrections go into the instruction files. Not the draft: the rules.
What stays behind
My style guide currently includes entries like these:
Never use "genuinely." (The AI used it twice in one paragraph. I deleted it, opened the style guide, and added this rule. It has not come back.)
Do not import a framework onto the material. Let the material generate the argument. (The AI wrote an essay about economics as though it were an essay about Cavell: it imposed my philosophical framework on a subject that needed to find its own structure. I rewrote from scratch and added this rule.)
After three revision rounds, rewrite from first principles. (A twelfth draft had accumulated so many surgical patches that the prose became convoluted. A clean rewrite was faster and better.)
Let one example breathe. Among compressed examples, expand one into a full scene. Stories get retold; lists get skimmed. (The most impactful change in one essay was expanding a one-sentence mention of Versailles into a full scene. It became the most shared passage.)
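In the file itself, these entries read as terse directives. A sketch of how the rules above might appear in a STYLE.md, with the backstories stripped out:

```markdown
# STYLE.md (excerpt)

- Never use "genuinely."
- Do not import a framework onto the material; let the material
  generate the argument.
- After three revision rounds, rewrite from first principles.
- Among compressed examples, expand one into a full scene.
```

The stories stay in my head; only the distilled rule goes in the file, because the rule, not the anecdote, is what the machine executes.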
The style guide is the precipitate: what crystallized out of dozens of iterative cycles. After enough sessions, the AI nails the voice on the first try, because the voice is encoded in the files it reads before it speaks. This is what I mean by natural language as software: these rules are not suggestions. They are instructions the machine executes.
The simplest way to maintain this: when you finish working, tell the AI "session review." It summarizes what happened, updates the task list, and proposes amendments to the style guide. You approve or edit. Five minutes. The system gets smarter after every session.
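The protocol file behind that five-minute ritual can be as short as a checklist. A hypothetical SESSION-REVIEW.md, as an illustration of the shape rather than my exact file:

```markdown
# SESSION-REVIEW.md

When the user says "session review":
1. Summarize what was done this session, in three sentences or fewer.
2. Update TASKS.md: mark finished items done, move new items to the backlog.
3. Propose amendments to STYLE.md based on corrections made this session.
4. Wait for approval before writing any changes.
```

Step 4 is the important one: the AI proposes, the human ratifies. The style guide only accumulates rules you have explicitly accepted.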
One thing I learned the hard way: the system also needs pruning. Context accumulates, and stale context is worse than no context. A task list full of completed items misleads the AI about what is active. An instruction file with resolved decisions clutters the reading path. Part of the session review is now asking: what here has expired? The architecture is not "set it up and forget it." It is a living workspace that needs periodic maintenance, the way a garden does.
But the analogy to mentoring breaks at a revealing place. A student who learns "do not import a framework onto the material" eventually develops the judgment to recognize when a framework should be imported, because this particular material needs it. The machine never does. It follows the rule with perfect fidelity, in every situation, forever. The style guide gets tighter after every session, and the tighter it gets, the more it resembles a legal code: comprehensive, precise, and incapable of knowing when to make an exception. The judgment to override a rule you wrote yourself is the one thing you cannot encode in the file. That is not a flaw in the system. It is the system working as designed: the architecture handles everything that can be made explicit, and the remainder is yours.
(I am developing this idea further in an essay with the working title "Advising the Machine," about what the philosophy of pedagogy can teach us about working with AI, and what AI reveals about what teaching always was.)
Judgment as the real work
AI generates enormous volume. It will also, with unfailing politeness, smooth every rough edge in your prose until it reads like a Wikipedia article with better formatting. The technical term is sycophancy. The practical consequence is that the AI produces fluent, confident, wrong output faster than you can catch it. Your output is no longer text. It is decisions: what stays, what gets cut, what needs to be thrown out and started over, what question to ask next. The bottleneck has moved from doing the work to defining what work is worth doing. I wrote about this at length in "The Economics of Infinite Desire": when production is infinite, the scarce resource becomes judgment and taste. The Agentic Studio is where that abstract problem becomes concrete: every session, you sit in front of a machine that can produce anything, and your only job is to know what is worth keeping.
What this produced
The last code I had written before leaving for philosophy was plain HTML in a text editor. Within a few weeks, working evenings and weekends, the AI had scaffolded a full Next.js site with a custom markdown parser, a deployment pipeline on Vercel, and a content system where each essay is a plain text file with metadata that drives the entire rendering chain. The instruction files handled the framework; I supplied the decisions about structure, design, and what went on the page. Two of the essays I wrote were linked on Marginal Revolution, Tyler Cowen's economics blog. All maintained by one person with a demanding day job, where the same architecture feeds back and forth between personal writing and professional work.
None of this happened because the AI got it right on the first try. It happened because progressive disclosure made twenty years of Cavell material usable without overwhelming the context window, and because session reviews compounded: each essay draft sharpened the style guide, and the sharper style guide made the next essay's first draft land closer to the voice, which meant more time on the argument and less on surface corrections. The architecture does not produce the work. It removes the friction that kept the work from happening.
Try it yourself
This piece was written with the process it describes. A seed file in the knowledge base, over a dozen drafts, each round of corrections hardened into the style guide so the next round started closer to the voice. The architecture is not something I am recommending from a distance. It is the infrastructure underneath the words you just read.
I put together a starter template you can use today. Download the folder, open it with any AI tool that reads files (Claude, Cursor, Codex, whatever you prefer), and say: "Read SETUP.md and help me get started." The AI will walk you through four rounds of questions: what your project is about, what material you already have, how you want to sound, and what you are working on right now. ("What is the one thing a reader should take away from your work?" is the question that does the most heavy lifting.) Then it populates the operating instructions, the style guide, the index, and the task list. Five minutes. No coding.
After setup, the template includes guided workflows for organizing existing material, drafting from a seed idea, running a revision session, and processing voice captures. Each one is a structured instruction file the AI follows step by step.
Within a couple of weeks of regular use, you will notice the difference: first drafts that used to be unusable will need only structural edits, and the time you spend re-explaining context will drop to nearly zero. After ten sessions, the style guide will start catching things before you notice them. After thirty, the AI will nail your voice on the first try, because the voice is no longer in your head. It is in the files. It is, for the first time, shareable.
And that is when you will notice the strange thing: the better the system gets at sounding like you, the more clearly you see the part of the work that was never about the words. The style guide handles the voice. The index handles the context. The session reviews handle the memory. What is left is the thing no file can hold: knowing what matters, and why, and to whom. The architecture does not supply that. It only clears the way for it.