ilo: A Programming Language for AI Agents, Not Humans

SudoLang and ilo: Two Opposite Bets on AI Programming

SudoLang and ilo both call themselves AI programming languages. Both were designed around the idea that LLMs need something different from Python or JavaScript. After that, they diverge completely.

SudoLang is a pseudocode language. You write natural-language-ish instructions and the LLM interprets them. There is no compiler, no runtime, no type checker. The LLM is the execution engine.

ilo is a compiled language. You write code, a verifier checks the types, a register VM executes it, and the output is deterministic. The LLM generates ilo code, but never runs it.

Same problem, opposite solutions.

What SudoLang does

SudoLang was created by Eric Elliott and GPT-4 in 2023. The core idea: LLMs already understand pseudocode, so give them a structured pseudocode with just enough syntax to be unambiguous.

ChatBot {
  State {
    topic = ""
    history = []
  }
  Constraints {
    Respond in the user's language
    Keep responses under 200 words
  }
  /help - list available commands
  /topic [t] - set the current topic
}

That’s a valid SudoLang program. No function bodies, no implementation. The LLM infers what /topic should do based on the state definition and constraints.

Key features:

  • Constraint-based programming. You declare rules in natural language. “Keep responses under 200 words.” The LLM is expected to continuously respect them.
  • Interfaces define state and behaviour without implementation. Types are inferred.
  • Pipe operator (|>) for function composition.
  • Pattern matching with semantic inference: (player died) => extract penalty, respawn works because the LLM knows what those words mean.
  • No runtime. The LLM prompt is the execution environment.
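
A deterministic analogue helps pin down what the pipe operator means. SudoLang's |> is interpreted by the LLM, but it behaves like ordinary left-to-right function application, which can be sketched in Python (the pipe helper and the string functions are my illustration, not part of SudoLang):

```python
from functools import reduce

def pipe(value, *fns):
    """Apply fns to value left to right, like a SudoLang |> chain."""
    return reduce(lambda acc, f: f(acc), fns, value)

words = pipe("  Hello World  ", str.strip, str.lower, str.split)
# words == ["hello", "world"]
```

The difference, of course, is that here each stage must be a real function; in SudoLang a stage can be a phrase the model interprets.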

What ilo does

ilo takes the opposite approach. The language is fully specified with formal grammar, static types, and deterministic execution.

fac n:n>n;<=n 1 1;r=fac -n 1;*n r

That’s factorial. Prefix notation, no parentheses, semicolons between statements. The verifier checks types before execution. If you pass a text where a number is expected, you get ILO-T004 with a source location and a suggestion.
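
For anyone who doesn't read the notation yet, here is a line-by-line Python transliteration of that factorial. The mapping is mine; only the ilo line above is authoritative:

```python
def fac(n: int) -> int:
    # fac n:n>n -- fac takes a number and returns a number
    if n <= 1:          # <=n 1 1 -- base case: return 1 when n <= 1
        return 1
    r = fac(n - 1)      # r=fac -n 1 -- bind the recursive result first
    return n * r        # *n r -- multiply and return
```

Note that the recursive call is bound to r before the multiplication, mirroring the ilo source rather than writing n * fac(n - 1); the reason for that ordering comes up below.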

Key features:

  • Prefix notation. +a b instead of a + b. Saves 22% tokens across expression patterns by eliminating parentheses.
  • Short names everywhere. Builtins are 2-4 characters: len, srt, flt, hd, tl.
  • Static verification. Type errors, undefined variables, arity mismatches caught before execution.
  • Result types. R ok err with ! auto-unwrap and match arms for error handling.
  • Register VM. Compiles to bytecode. Deterministic execution, same input same output.
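
To make the verification bullet concrete, here is a toy arity-and-type checker in Python. The builtin table, type names, and message format are illustrative assumptions; ilo's actual verifier and its ILO-T004 diagnostics are more detailed:

```python
# Toy pre-execution checker -- a sketch of the idea, not ilo's verifier.
BUILTINS = {"+": ("num", "num"), "len": ("text",)}  # name -> parameter types

def verify(call):
    """Check arity and argument types of one call before execution."""
    op, args = call[0], call[1:]
    params = BUILTINS[op]
    if len(args) != len(params):
        return f"arity error: {op} expects {len(params)} args, got {len(args)}"
    for i, (arg, want) in enumerate(zip(args, params)):
        got = "num" if isinstance(arg, (int, float)) else "text"
        if got != want:
            return f"type error: arg {i} of {op} is {got}, expected {want}"
    return None  # verified: safe to execute

print(verify(("+", 1, "two")))  # rejected before anything runs
```

The point is the shape of the workflow: the bad call never executes, and the error is a short, structured message the generating model can act on.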

Where they disagree

SudoLang and ilo disagree on a basic question: should the LLM be the runtime, or should the LLM be the code generator?

SudoLang says: the LLM is smart enough to be the computer. Don’t waste time specifying implementation. Declare what you want, add constraints, and let inference handle the rest. Eric Elliott puts it directly: “The AI is usually smart enough to figure out the ‘how’ for you.”

ilo says: the LLM is good at generating code, bad at being a computer. LLMs hallucinate, lose track of state, and produce different outputs on the same input. Give them a small target language with verification, and let a real runtime handle execution.

The split produces different failure modes.

Where SudoLang breaks

SudoLang inherits every problem LLMs have. If the model misinterprets a constraint, there is no error message. It just does the wrong thing. Constraints like “keep responses under 200 words” are aspirational, not enforced. The model might respect it 90% of the time, or 70%, or not at all depending on context length and prompt position.

State management is the bigger issue. SudoLang programs define state in interfaces, but that state lives in the LLM’s context window. There’s no actual variable binding, no memory, no persistence. The model tracks state through attention patterns, which degrade as conversations get longer.

You also can’t test SudoLang programs. There’s no assertion framework because there’s no deterministic output. The same SudoLang program run twice might produce different behaviour.

Where ilo breaks

ilo’s problems are different. The syntax is dense and unfamiliar. fac n:n>n;<=n 1 1;r=fac -n 1;*n r is not something a human reads comfortably. That’s by design (the audience is LLMs, not people), but it creates a learning curve for anyone who needs to review generated code.

The prefix notation has real parsing ambiguity. *n fac -n 1 looks like it should multiply n by the result of fac(n-1), but the parser reads fac as an atom operand of *, not a function call. You need the bind-first pattern: r=fac -n 1;*n r. An LLM has to learn this rule. Most do after seeing the spec, but it’s a source of generation errors.
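
The misparse is easy to reproduce with a toy greedy prefix parser. The arity table and parse function here are my sketch, not ilo's grammar, but they show the mechanism:

```python
ARITY = {"*": 2, "-": 2}  # operators with fixed arity; bare names are atoms

def parse(tokens):
    """Consume one prefix expression greedily from the front of tokens."""
    head = tokens.pop(0)
    if head in ARITY:
        return (head, *[parse(tokens) for _ in range(ARITY[head])])
    return head  # a bare name or literal is an atom, never a call

toks = ["*", "n", "fac", "-", "n", "1"]
tree = parse(toks)
# tree == ("*", "n", "fac")  -- fac was consumed as an atom operand of *
# toks == ["-", "n", "1"]    -- the intended argument expression is orphaned
```

Binding first (r=fac -n 1;*n r) sidesteps the problem because the call appears in statement position, where the parser knows fac is being applied.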

ilo also can’t do what SudoLang does well. “Respond in the user’s language” is trivial in SudoLang. In ilo, you’d need to implement language detection, translation, and response formatting from scratch. ilo handles computation. SudoLang handles vague intent.

Token efficiency

Both claim token savings. SudoLang cites 20-30% fewer tokens than natural language prompts, based on research showing pseudocode prompts improve response scores by 12-38%.

ilo’s claim is different. It measures total tokens across the full cycle: spec loading, code generation, error feedback, and retries. Prefix notation saves 22% of expression tokens. Short builtins save characters. But the bigger saving comes from verification. A type error caught by the verifier costs one error message. The same error caught at runtime (or worse, producing wrong output silently) costs a full retry cycle.
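
The shape of that claim can be written down as a back-of-envelope cost model. Every number below is a hypothetical placeholder chosen to show the structure of the argument, not a measurement:

```python
# Hypothetical token counts -- placeholders, not measurements.
GEN = 150          # tokens to generate one program attempt
VERIFY_MSG = 30    # tokens for a compact verifier error message
RETRY_PROMPT = 80  # tokens to describe a runtime failure and re-prompt

# Verifier path: attempt, short structured error, corrected attempt.
with_verifier = GEN + VERIFY_MSG + GEN
# Runtime path: attempt, prose failure report, full regeneration.
without_verifier = GEN + RETRY_PROMPT + GEN

# The verifier wins whenever its error message is cheaper than the
# prose needed to diagnose and explain a runtime failure.
savings = without_verifier - with_verifier  # 50 under these assumptions
```

And this model is charitable to the runtime path: a silently wrong output costs not one retry but however many cycles it takes to notice the error at all.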

The comparison isn’t apples-to-apples. SudoLang saves tokens on the prompt side. ilo saves tokens on the generation + retry side.

Different tools for different jobs

SudoLang works best for chatbot personas, interactive applications, and anything where the desired behaviour is easier to describe than to implement. If you’re building a text adventure game and want the LLM to manage quest state, SudoLang’s constraint system is a good fit.

ilo works best for computation, data pipelines, and tool-calling workflows where correctness matters. If you’re processing a CSV, calling APIs, and writing results to a file, you want deterministic execution and type checking.

Both projects start from the same observation: general-purpose languages like Python and JavaScript are a poor fit for AI agents because they cost too many tokens. SudoLang and ilo drew opposite conclusions about what to do about it.

SudoLang went higher, with less syntax and more natural language so the model fills in the gaps.

ilo went lower, with stricter structure so the model fills in as little as possible.

I built ilo because I don’t trust LLMs to be reliable runtimes. SudoLang’s insight is that the boundary between “code” and “instructions” blurs when your execution engine understands English. That insight is useful for some workloads and dangerous for others, depending on whether the output has to be reproducible.