Coding agent in 94 lines of Ruby

(radanskoric.com)

100 points | by radanskoric 2 days ago

11 comments

  • ColinEberhardt 1 hour ago
    Great post, thanks for sharing. I wrote something similar a couple of years ago, showing just how simple it is to work with LLMs directly rather than through LangChain, adding tool use etc …

    https://blog.scottlogic.com/2023/05/04/langchain-mini.html

    It is of course quite out of date now as LLMs have native tool use APIs.

    However, it proves a similar point to yours, in most applications 99% of the power is within the LLM. The rest is often just simple plumbing.

    • radanskoric 48 minutes ago
      Thanks for sharing this. The field moves fast, so yes, it's out of date, but it's useful to see how the tools concept evolved, especially since I wasn't paying attention to that area of development back when you wrote your article. Very interesting.
  • thih9 1 hour ago
    > Claude is trained to recognise the tool format and to respond in a specific format.

    Does that mean that it wouldn’t work with other LLMs?

    E.g. I run Qwen3-14B locally; would that or any other model similar in size work?

    • radanskoric 44 minutes ago
      It would work with most other tool-enabled LLMs. RubyLLM abstracts away the format. Some will work better than others, depending on the provider, but almost all have tool support.

      Claude is just an example. I pulled the actual payloads by looking at what is actually being sent to Claude and what it responds with. It might vary slightly for other providers. I used Claude because I already had a key ready from trying it out before.

    • simonw 55 minutes ago
      Qwen3 was trained for tool usage too. Most models are these days.

      https://qwenlm.github.io/blog/qwen3/#agentic-usages

  • thih9 57 minutes ago
    > return { error: "User declined to execute the command" }

    I wonder if AIs that receive this information within their prompt might try to change the user’s mind as part of reaching their objective. Perhaps even in a dishonest way.

    To be safe I’d write “error: Command cannot be executed at the time”, or “error: Authentication failure”. Unless you control the training set; or don’t care about the result.

    Interesting times.

    • radanskoric 43 minutes ago
      If a certain user is susceptible to having the LLM convince them to run an unsafe command, I fear we can't fix that by trying to trick the LLM. :D

      Either the user needs to be educated or we need to restrict what the user themselves can do.
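
      A minimal sketch of the confirmation gate being discussed, in plain Ruby. The helper name `run_with_confirmation` and its `input:`/`output:` parameters are hypothetical and exist only for this illustration; this is not the article's code.

      ```ruby
      require "stringio"

      # Hypothetical helper: run a shell command only after the user confirms.
      # `input` and `output` are injectable so the prompt can be exercised
      # without a real terminal.
      def run_with_confirmation(command, input: $stdin, output: $stdout)
        output.print "Run `#{command}`? (y/n) "
        answer = input.gets.to_s.strip.downcase
        # Anything other than "y" is reported back to the model as an error hash.
        return { error: "User declined to execute the command" } unless answer == "y"

        { result: `#{command}` }
      end

      # Simulate a user typing "n" at the prompt.
      declined = run_with_confirmation("echo hi", input: StringIO.new("n\n"), output: StringIO.new)
      p declined
      ```

      Whatever string goes into the error hash is what the model sees, so the wording debated above is a one-line change.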

  • RangerScience 9 hours ago
    This is very cool, somewhat inspiring, and (personally) very informative: I didn't actually know what "agentic" AI use was, but this did an excellent job (incidentally!) explaining it.

    Might poke around...

    What makes something a good potential tool, if the shell command can (technically) do anything - like running tests?

    (or is it just the things requiring user permission vs not?)

    • radanskoric 2 hours ago
      Thanks, sharing my learnings on how coding agents work was my main intention with the article. Personally I was a bit surprised by how much of the "magic" is coming directly from the underlying LLM.

      The shell command can run anything really. When I tested it, it asked me multiple times to run the tests and then I could see it fixing the tests in iterations. Very interesting to observe.

      If I were to improve this to be a better Ruby agent (which I don't plan to do, at least not yet), I would probably try adding some RSpec/Minitest specific tools that would parse the response and present it back to the LLM in a cleaned up format.

    • tough 8 hours ago
      > What makes something a good potential tool, if the shell command can (technically) do anything - like running tests?

      Think of it as -semantic- wrappers so the LLM can -decide- what action to take at any given moment given its context, the user prompt, and available tools names and descriptions.

      Creating wrappers for the most used basic tools, even if they all pipe to terminal unix commands, can be useful.

      Also giving it a specific knowledge base it can consult on demand, like a wiki of its own stack etc.

      • notpushkin 7 hours ago
        Also it’s safer than just giving unrestricted shell access to an LLM.
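
        A rough sketch of the "semantic wrappers" idea in plain Ruby. The `TOOLS` registry and `call_tool` dispatcher are hypothetical names made up for this illustration and are not RubyLLM's API; the point is only that the model picks from a short list of named, described actions instead of getting a raw shell.

        ```ruby
        # Hypothetical tool registry: each entry pairs a description (what the
        # model would read) with a lambda wrapping one narrow action.
        TOOLS = {
          list_files: {
            desc: "List files in a directory",
            run: ->(path:) { Dir.children(path).sort }
          },
          read_file: {
            desc: "Read the contents of a file",
            run: ->(path:) { File.read(path) }
          }
        }.freeze

        # Dispatch a tool call by name. Unknown names return an error hash
        # instead of ever reaching a shell.
        def call_tool(name, **args)
          tool = TOOLS[name.to_sym]
          return { error: "Unknown tool: #{name}" } unless tool

          { result: tool[:run].call(**args) }
        end

        p call_tool(:rm_rf, path: "/")
        ```

        A call such as `call_tool(:list_files, path: ".")` returns the sorted directory entries wrapped in a `result:` hash.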
  • fullstackwife 7 hours ago
    This reminds me of PHP hello world programs which would take a string from GET, use it as a path, read a file from this path, and return the content in the response. You could make a website while not using any knowledge about websites.

    Agents are the new PHP scripts!

  • rbitar 9 hours ago
    RubyLLM has been a joy to work with so nice to see it’s being used here. This project is also great and will make it easier to build an agent that can fetch data outside of the codebase for context and/or experiment with different system prompts. I’ve been a personal fan of claude code but this will be fun to work with
    • radanskoric 2 hours ago
      Author here. The code I made took me 3 hours (including getting up to speed on RubyLLM). I also intentionally DIDN'T use a coding assistant to write it (although I use Windsurf in my regular work). :D

      It's clearly not a full featured agent but the code is here and it's a nice starting point for a prototype: https://github.com/radanskoric/coding_agent

      My best hope for it is that people will use it to experiment with their own ideas. So if you like it, please feel free to fork it. :)

  • sagarpatil 2 hours ago
    I don’t understand the hype in the original post.

    OpenAI launched function calls two years ago and it was always possible to create a simple coding agent.

    • radanskoric 2 hours ago
      Author here. The part about coding agents that wasn't clear to me was how much of the "magic" is in the underlying LLM and how much in the code around it making it into an agent.

      When I realised that it's mostly in the LLM I found that a bit surprising. Also, since I'm not an AI Engineer, I was happy to realise that my "regular programming" skills would be enough if I wanted to build a coding agent.

      It sounds like you were aware of that for a while now, but I and a lot of other people weren't. :)

      That was my motivation for writing the article.

  • behnamoh 7 hours ago
    > it's only 400 lines of Ruby

    also the article:

    > it uses RubyLLM

    (which itself is a lot more code added to 400 lines claimed by the author).

    > Ruby leads to programmer happiness...

    Am I the only one who finds Ruby code cryptic and less clear than JS, Python, and Rust?

    As a person who's dabbled a bit in Rust and Ruby, I'm surprised people find Ruby intuitive and simple. For example:

    > def execute(path:)

    why the "incomplete" colon?

    > module Tools

    > class ListFiles < RubyLLM::Tool

    what's with the "<"?

    > param :command, desc: "The command to execute"

    again, the colons are confusing...

    • miki123211 3 hours ago
      Those colons are symbols, and (as a non-rubyist), yes, I think they were a bad idea.

      Ruby symbols (and atoms in Lisp / Erlang / Elixir) are a performance hack, they are basically "interned" strings. They're immutable and are mostly used for commonly-repeated values, e.g. names of keyword arguments, enum values etc. Unlike strings, they're supposed to be treated as a single, opaque value, with no other properties beyond the name they represent. Most languages don't give you APIs to uppercase or slice symbols, though you can usually convert to strings and back if you really need to.

      The performance advantage of symbols/atoms is that they're represented as pointers or offsets into a global pool, which contains their names. This means repeated instances of the same symbol take up much less memory than they would as ordinary strings, as each instance is just a single pointer, and the actual name it represents is only stored once, in the global pool. Comparing symbols is also much faster than strings, as two symbols representing the same name will always have the same offset, turning an O(n) character-by-character comparison into a simple O(1) comparison of pointers.

      It's really strange to me that so many high-level languages, which prize themselves on "developer happiness" and "mental compression" force you to think about strings in this way, out of all things.

      I think there's something to be said for having a single character that can be used to represent identifiers as strings. Giving your internal string type an interned representation (with automatic interning of literals and a method to intern arbitrary strings) is also an interesting idea, though I know far too little about interpreter design to have an opinion on how much performance difference it would make. Conflating those two ideas together just seems like a really bad deal to me.
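
      A short sketch of the interning behaviour described above, as it shows up in Ruby. `equal?` compares object identity, and `.dup` is used here only to force two distinct string objects regardless of any frozen-string-literal setting.

      ```ruby
      # The same symbol literal always resolves to one object.
      a = :command
      b = :command
      puts a.equal?(b)                        # true: same object

      # Two strings with the same content can be separate objects.
      s1 = "command".dup
      s2 = "command".dup
      puts s1.equal?(s2)                      # false: different objects
      puts s1 == s2                           # true: same characters

      # Converting a string to a symbol looks it up in the shared pool.
      puts "command".to_sym.equal?(:command)  # true
      ```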

      • endorphine 1 hour ago
        It just occurred to me that Ruby's symbols are pretty much the same as Rust's string slices. No?
        • jbverschoor 1 hour ago
          No. A symbol is basically a constant such as for example HTTP_NOT_FOUND. The difference is that you don’t assign it a number, which means fewer decisions during development. Another difference is that it will be printed as the string, so no to_string boilerplate.

          The other colon path: is used for named parameters/keyword args.

          It’s the same syntax as a hash/dict, but as a parameter it does not have a default value in this case.

          The old syntax, but still supported for hash/dict was “key => value”

          Rust’s string slices are the equivalent of Ruby’s slices of an array: myarr[5..18]

    • jeffrwells 6 hours ago
      As someone who writes a lot of Ruby and has for a long time, I have never thought the metric Ruby optimizes for is “intuitive” (I think it is, but been doing it too long / too close to it so it’s intuitive to me)

      The stated optimization from Matz (who created Ruby) is “developer happiness”

      The important optimization for me is “fidelity to business logic”, eg less cruft and ruby syntactic sugar means you could sit at your computer and read your business rules (in code) out loud in real time and be understood by a non-dev

    • nottorp 34 minutes ago
      > > it uses RubyLLM

      > (which itself is a lot more code added to 400 lines claimed by the author).

      Most "X in n (<100) lines of code" projects posted on here mean just that. Import 2 million lines of code from libraries and just count the glue code lines.

      It's marketing or something.

    • nomilk 6 hours ago
      > def execute(path:)

      > why the "incomplete" colon?

      Just ruby's syntax to denote 'keyword arguments', i.e. when calling this method, you must explicitly use the parameter name, like so: execute(path: "/some/path")

      > class ListFiles < RubyLLM::Tool

      > what's with the "<"?

      Just ruby's way of denoting inheritance i.e. class A < B means class A inherits from B

      > param :command, desc: "The command to execute"

      > again, the colons are confusing...

      The colons can be confusing, but you get used to them. The first one in :command is denoting a symbol. You could think of it as just a string. Any method that expects a symbol could be rewritten to use a string instead, but symbols are a tad more elegant (one less keypress). The second one, desc:, is a keyword argument being passed to the param method. You could rewrite it as param(:command, desc: "The command to execute")

      I sometimes wonder what ruby would be like without symbols, and part of me thinks the tradeoff would be worth it (less elegant, since there'd be "strings" everywhere instead of :symbols, but it would be a bit easier to learn/read).
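
      The three pieces of syntax above can be tried in one small file. This is only a sketch: the `Tool` base class and its `param` macro are hypothetical stand-ins written for this example, not RubyLLM's implementation.

      ```ruby
      # Hypothetical stand-in for a tool base class (not RubyLLM's code).
      class Tool
        # `param :path, desc: "..."` is an ordinary method call with one
        # positional symbol and one keyword argument.
        def self.param(name, desc:)
          params[name] = desc
        end

        # Per-class storage for declared parameters.
        def self.params
          @params ||= {}
        end
      end

      # `<` means ListFiles inherits from Tool.
      class ListFiles < Tool
        param :path, desc: "Directory to list" # same as param(:path, desc: "...")

        # `path:` declares a required keyword argument.
        def execute(path:)
          Dir.children(path).sort
        end
      end

      p ListFiles.superclass
      p ListFiles.params
      ```

      Calling `ListFiles.new.execute(path: ".")` works, while `ListFiles.new.execute(".")` raises ArgumentError because `path` must be passed by name.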

      • d4mi3n 4 hours ago
        Another couple important aspects of symbols:

        1. They refer to the same object in memory, so multiple usages of a symbol by the same name (e.g. `:param`) do not add additional memory overhead.

        2. Symbols are also immutable in that they cannot be mutated in any way once they are referenced/created.

        These properties can be useful in some contexts, but in practice they're effectively used as immutable constant values.

      • jbverschoor 54 minutes ago
        I disagree. symbols are constants, and strings are text. Very clear distinction, relevant when you have syntax highlighting.

        Working with json/xml, or anything that has keys and text-values for that matter is a lot better with symbols

    • tptacek 7 hours ago
      You could chuck RubyLLM and still golf it down pretty far. Nothing RubyLLM is doing is magic. Read the linked article that inspired it (which is in Go); there's even less magic there.

      Some of the complaints here are just about Ruby syntax, which: shrug.

    • Mystery-Machine 6 hours ago
      > As a person who's dabbled a bit in Rust and Ruby

      You say you "dabbled" a bit in Ruby and then proceed to demonstrate your lack of understanding of basic Ruby syntax.

      • kemotep 4 hours ago
        In Dwarf Fortress, the lowest possible skill rank other than no skill is dabbling. What does dabbling mean to you?
        • ryeguy 1 hour ago
          To me, dabbled means learned the fundamentals of the language but hasn't written real software in it. I would expect a dabbler to have been exposed to class inheritance and primitives such as symbols.
      • behnamoh 5 hours ago
        hence the word "dabbled".
    • BugsJustFindMe 6 hours ago
      > As a person who's dabbled a bit in Rust and Ruby, I'm surprised people find Ruby intuitive and simple

      IMO it's because they mostly just don't think about the chaos they're doing. You encounter a function call that ends in a hash. Does the function being called use that as a hash or does it explode the hash to populate some extra trailing parameters? You have no way to know without going and looking. To me that's super dumb. To them it's just another day in unexpected behavior land.

      • jeffrwells 6 hours ago
        Why the us vs them mentality? How is that productive? Did people who love Ruby do something to you?

        Pretty hard to grow and learn new concepts if you immediately label anything you don’t understand yet as “super dumb”

        • BugsJustFindMe 6 hours ago
          This isn't any new concept to learn. There are real consequences to the other people who need to read and improve the code later to not being able to form rapid linear intuition about the meaning of values being passed around programs without looking somewhere else. It's like reading a garden path sentence where the understanding of the beginning can only exist after working backwards from the end. And there's definitely a distinct dichotomy between developers who detect the discommoding and developers who don't.
      • jaredsohn 5 hours ago
        re: function that ends in a hash

        I assume this is referring to passing hashes in as parameters to methods. Ruby 3 made this more explicit; you must use ** to convert a hash into keyword arguments and you must use {} around your key/values if the first argument is a hash.
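
        A small sketch of the Ruby 3 behaviour, assuming Ruby 3.0 or later. The method `connect` and its arguments are made up for the example; `**` passes a hash as keyword arguments, while passing the hash itself counts as an extra positional argument.

        ```ruby
        # Illustrative method: one positional argument, one keyword argument.
        def connect(host, port: 80)
          "#{host}:#{port}"
        end

        opts = { port: 8080 }

        # An explicit double splat turns the hash into keyword arguments.
        puts connect("example.com", **opts)

        # Without the splat, Ruby 3 treats the hash as a second positional
        # argument, which this method does not accept.
        begin
          connect("example.com", opts)
        rescue ArgumentError => e
          puts "ArgumentError: #{e.message}"
        end
        ```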

        • BugsJustFindMe 5 hours ago
          I'm glad the light was eventually seen after 25 years, but that's a significantly breaking change which means the update often doesn't happen. A lot of Ruby 2 codebases still exist with people still doing Ruby 2 bad behaviors.
  • Mystery-Machine 6 hours ago
    Just out of curiosity, I never understood why people do `ENV.fetch("ANTHROPIC_API_KEY", nil)` which is the equivalent of `ENV["ANTHROPIC_API_KEY"]`. I thought the whole point of calling `.fetch` was to "fail fast". Instead of assigning `nil` as default and having `NoMethodError: undefined method 'xxx' for nil` somewhere random down the line, you could fail on the actual line where a required (not optional) ENV var wasn't found. Can someone please explain?
    • radanskoric 1 hour ago
      Author here. You're actually right here.

      I took the code from RubyLLM configuration documentation. If you're pulling in a lot of config options and some have default values then there's value in symmetry. Using fetch with nil communicates clearly "This config, unlike those others, has no default value". But in my case, that benefit is not there so I think I'll change it to your suggestion when I touch the code again.

    • jaredsohn 5 hours ago
      There might be code later that says that if the anthropic api key is not set, then turn off the LLM feature. Wouldn't make sense for this LLM-related code but the concept makes sense for using various APIs from dev.
      • riffraff 49 minutes ago
        But if you do ENV[xxx] the value is also set to nil.

        Using .fetch with a default of nil is what's arguably not very useful.

        IMO it's just a RuboCop rule to use .fetch, which is useful in general for exploding on missing configuration but not useful if a missing value is handled.
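
        The three variants side by side, as a quick sketch. The variable name `EXAMPLE_MISSING_API_KEY` is hypothetical and is deleted first so the example behaves the same on any machine.

        ```ruby
        key = "EXAMPLE_MISSING_API_KEY" # hypothetical name, forced to be unset
        ENV.delete(key)

        # Both of these quietly return nil for a missing variable.
        p ENV[key]
        p ENV.fetch(key, nil)

        # fetch without a default fails right here, on the line that needs it.
        begin
          ENV.fetch(key)
        rescue KeyError => e
          puts "KeyError: #{e.message}"
        end
        ```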

  • paulddraper 6 hours ago
    94 lines if you import thousands of lines and connect to remote services.

    Useless metric.

    • radanskoric 2 hours ago
      Author here. I put the lines of code into the title to communicate to the reader that they can get a good understanding just by reading this article.

      Basically, what I wanted to say was: "Here is an article on building a prototype coding agent in Ruby that explains how it works and the code is just 94 lines so you'll really be able to get a good understanding just by reading this article."

      But that's a bit too long for a title. :)

      Also, a big benefit of Ruby is how concise it is. Yes, there's a library underneath my code but that library is, like Ruby, also very concise. That's kind of the point.

      I can see how you might have read the title as hype-ish, but that was not my intention.

    • andai 2 hours ago
      It's useful for educational purposes, to show how simple it is, and make it easier to understand by boiling it down to its essence.

      The imports indeed take away from that.

      You can also do it without imports in about the same number of lines, if you strip it down a bit.

    • jeffrwells 6 hours ago
      LOC or code golf is indeed not a metric to optimize to 0, but hating on importing libraries is a bad take.

      As CTO at Redo do you demand everything written in assembly? Does your entire company’s code run in one file or do you use abstraction to simplify?

      I’m not quite clear on how expressing a really complex and paradigm shifting approach of agents in a more concise way is a bad thing

    • dismalaf 6 hours ago
      Unless you're writing a kernel from scratch, you're calling lots of code someone else wrote once upon a time...
      • radanskoric 1 hour ago
        Even if you're writing a kernel, you're using a compiler. And if you're writing a compiler it's probably bootstrapped and built by itself, so you're still using a mountain of code some other people wrote. :)