Edge cases at runtime.

The hardest problem in every real-world app is the one you couldn't predict. Houston gives the user's AI the authority to handle it on the spot.

The edge case problem

Every real app has edge cases the vendor didn't anticipate.

In traditional apps, the user has three options. Email support and wait. Do without. Take the output to another tool and fix it by hand.

None of these are good.

What Houston does differently

In a Houston app, the user's AI can edit the local code that handles their workflow.

A Python script that generates an export? The AI can edit it. On the user's machine. For this user only.

Next time the user exports, the new code runs. Done.

Add an "exchange rate" column to my exports. Pull it from the transaction date.
Done. Your next export will have an exchange rate column, looked up by each transaction's date. Want me to run a test export on last month's data so you can check it's what you need?
Run a test export Undo
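To make the exchange concrete, here is a sketch of the kind of edit the AI might make to the user's local export script. Everything here is illustrative: the rate table, the field names, and `add_exchange_rate` are invented for this example, and a real script would look rates up from a data source rather than a hard-coded dict.

```python
from datetime import date

# Hypothetical rate table; a real script would query an API or a
# bundled dataset. Invented for illustration, not a Houston API.
RATES_BY_DATE = {
    date(2024, 1, 15): 1.09,
    date(2024, 1, 16): 1.10,
}

def add_exchange_rate(rows):
    """Add an 'exchange_rate' column, looked up by each transaction's date."""
    for row in rows:
        tx_date = date.fromisoformat(row["date"])
        row["exchange_rate"] = RATES_BY_DATE.get(tx_date)
        yield row
```

The edit is small and local, which is exactly what makes it safe to apply per-user.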

The user never talked to you. You never shipped a custom build. They got what they needed.

What the AI can and can't touch

The AI can only write inside the agent folder: ~/Documents/Houston/{Workspace}/{AgentName}/.

Inside that folder, it can touch:

  - the scripts your app ships with
  - rule files like rules.json

It cannot touch:

  - anything outside the agent folder

This boundary is enforced by the framework, not by a prompt. The AI isn't asked nicely to stay in its folder. It physically can't leave.
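Houston's enforcement lives in the framework itself, but as a rough illustration of what path containment means, here is a minimal check in Python. The literal Workspace/AgentName segments stand in for the real placeholders, and `is_inside_agent_folder` is an invented name, not a Houston API.

```python
from pathlib import Path

# Stand-in for ~/Documents/Houston/{Workspace}/{AgentName}/
AGENT_ROOT = Path.home() / "Documents" / "Houston" / "Workspace" / "AgentName"

def is_inside_agent_folder(candidate: str) -> bool:
    """Reject any write target that escapes the agent folder,
    including ../ traversal and absolute paths."""
    root = AGENT_ROOT.resolve()
    resolved = (root / candidate).resolve()
    return resolved == root or root in resolved.parents
```

Resolving the path before comparing is what defeats `../` tricks: the check runs on where the write would actually land, not on the string the AI supplied.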

Important

The AI can rewrite the scripts your app ships with. That means the user's copy of those scripts diverges from yours. That's the point — it's how edge cases get handled. Design your app for scripts that are meant to be customized.

Rules as data, not code branches

There's a pattern that makes this work well over time. Keep variance in data, not in code.

The script stays small and general. The rules live in a file like rules.json. The script reads the rules and applies them.

Most of the time, the AI edits the rules file, not the script. The script only changes when the shape of the rules changes, which is rare.

.houston/scripts/
  export.py     # small, general, edited rarely
  rules.json    # per-client rules, edited by the AI often
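A minimal sketch of the pattern in Python. The rule schema (a rules.json keyed by client, each with an `extra_columns` list) is invented for illustration; the real shape of your rules file is up to your app.

```python
import json
from pathlib import Path

def load_rules(path="rules.json"):
    """Read the per-client rules the AI maintains."""
    return json.loads(Path(path).read_text())

def apply_rules(rows, rules, client):
    """The script stays small and general; the variance lives in the rules."""
    extra_columns = rules.get(client, {}).get("extra_columns", [])
    for row in rows:
        for col in extra_columns:
            row.setdefault(col, None)  # value computed elsewhere in a real export
        yield row
```

When a user asks for a new per-client column, the AI appends to rules.json and `apply_rules` never changes.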

Without this pattern, six months of edge cases becomes spaghetti. With it, the script stays clean and the rules file grows into a record of everything the user has asked for.

When a rule repeats across clients, the AI can suggest lifting it to a default:

AI: I've noticed you've asked for the exchange rate column for three clients now. Want me to make it the default for everyone?

[Yes, make it the default]  [No, keep it per-client]
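Spotting that repetition is simple bookkeeping over the rules file. A sketch, assuming the same invented per-client schema (clients mapped to `extra_columns` lists); `suggest_defaults` and the threshold are illustrative choices, not Houston behavior.

```python
from collections import Counter

def suggest_defaults(rules, threshold=3):
    """Find rule values repeated across enough clients to be
    worth promoting to an app-wide default."""
    counts = Counter()
    for client_rules in rules.values():
        for col in client_rules.get("extra_columns", []):
            counts[col] += 1
    return [col for col, n in counts.items() if n >= threshold]
```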

Tests and rollback

Code changes are higher stakes than rule changes. Houston treats them that way.

When the AI edits a script, three things happen:

  1. The change is committed to git with a plain-English message.
  2. The user sees a one-line summary of what changed, with an undo button.
  3. Where possible, the AI writes a test alongside the change and runs it. If the test fails, the change doesn't land.

Rule changes are eager — they apply immediately. Code changes pause for a beat. That asymmetry is on purpose: rewriting logic deserves more care than rewriting an English sentence.
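The commit-test-gate flow for code changes can be sketched as follows. `land_change` is an invented name; the git commands are the standard CLI, and the runner is injectable so the flow can be exercised without a real repository.

```python
import subprocess

def land_change(path, summary, run_tests, run=subprocess.run):
    """Gate a script edit: run its test first, commit with a
    plain-English message on pass, discard the edit on failure."""
    if not run_tests():
        # Test failed: the change doesn't land.
        run(["git", "checkout", "--", path], check=True)
        return "reverted"
    run(["git", "add", path], check=True)
    run(["git", "commit", "-m", summary], check=True)
    return "committed"
```

Because every landed change is a commit with a plain-English message, the user-facing undo button can be a single revert.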

The unified timeline

Learned rules, rule-file edits, and script edits all show up in the same "what the agent has changed for you" timeline. One place, plain English, undo on every entry. The user doesn't have to understand the difference between a rule and a script edit to roll one back.
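One way such a timeline could be modeled, with invented field names; the point is that all three kinds of change share one shape, each carrying a summary and an undo handle.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class TimelineEntry:
    kind: str        # "learned_rule", "rule_edit", or "script_edit"
    summary: str     # plain-English description shown to the user
    undo_token: str  # e.g. a git commit hash or a rule id
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def render(timeline):
    """One place, plain English, undo on every entry."""
    return [f"{entry.summary}  [Undo]" for entry in timeline]
```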