Making my Own Agent, Part 1: The what and why
I've been building my own agent harness for a few months and figured it was time to write about it. I don't have all the answers, but I learned a lot from other people's process posts and wanted to add to that pile. This is part one.
The foundation:
With that in mind, I set out to understand how an agent actually works. I found a tutorial from Thorsten Ball of Amp Code, and it was very insightful. I don't want to recount everything he said in that blog post, but I do want to highlight that he shows how easy it is to make an agent in only 400 lines of Go code, maybe less if you optimize a little. At the most basic level it is all very readable.
And nowadays, models are designed to be agentic. That reminds me of a discussion I had with a few people in a local Slack group about how, ultimately, an LLM nowadays is an agent: you give it some tools and it just calls them. It doesn't take that much work. Seriously, you give it a tool definition like this:
var ReadFileDefinition = ToolDefinition{
	Name:        "read_file",
	Description: "Read the contents of a given relative file path. Use this when you want to see what's inside a file.",
	InputSchema: ReadFileInputSchema,
	Function:    ReadFile,
}
And to make it available to the model, you pass it along with the request:
conversation := []anthropic.MessageParam{
	anthropic.NewUserMessage(anthropic.NewTextBlock("What's in main.go?")),
}
message, err := client.Messages.New(ctx, anthropic.MessageNewParams{
	Model:     anthropic.ModelClaude3_7SonnetLatest,
	MaxTokens: int64(1024),
	Messages:  conversation,
	Tools: []anthropic.ToolUnionParam{
		{
			OfTool: &anthropic.ToolParam{
				Name:        ReadFileDefinition.Name,
				Description: anthropic.String(ReadFileDefinition.Description),
				InputSchema: ReadFileDefinition.InputSchema,
			},
		},
	},
})
And the model calls it when it's appropriate. The things you do have to worry about are, first, the logic of the tool (is it going to the right API to get the weather?) and, second, the description and JSON input schema. How does an agent know when to call a tool? You load the description and input schema when the agent starts up and send them along with each request, so as the model works through whatever it's working on, it has in its context: hey, these are the tools you have access to. Say you want to get the weather: it knows it has access to the getWeather tool, and that it needs to pass a ZIP code as input.
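To make the getWeather example concrete, here is a minimal sketch of what the harness side might look like. It uses only the standard library and fakes the model's tool call, so nothing here talks to a real API. The ToolDefinition shape, the zip_code field name, the stubbed weather string, and the ExecuteTool helper are all assumptions for illustration, not code from the tutorial or from my harness.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolDefinition is an assumed shape: a name, a description the model reads,
// a JSON schema for the input, and the Go function that runs the tool.
type ToolDefinition struct {
	Name        string
	Description string
	InputSchema map[string]any
	Function    func(input json.RawMessage) (string, error)
}

// GetWeatherInput is the hypothetical input the model has to produce.
type GetWeatherInput struct {
	ZipCode string `json:"zip_code"`
}

// GetWeather is a stub. A real tool would call a weather API here.
func GetWeather(input json.RawMessage) (string, error) {
	var in GetWeatherInput
	if err := json.Unmarshal(input, &in); err != nil {
		return "", fmt.Errorf("invalid input: %w", err)
	}
	if in.ZipCode == "" {
		return "", fmt.Errorf("zip_code is required")
	}
	return "Sunny, 72F in " + in.ZipCode + " (stubbed)", nil
}

// GetWeatherDefinition is what the model sees: when to use the tool and
// what input it needs.
var GetWeatherDefinition = ToolDefinition{
	Name:        "getWeather",
	Description: "Get the current weather for a US ZIP code. Use this when the user asks about the weather.",
	InputSchema: map[string]any{
		"type": "object",
		"properties": map[string]any{
			"zip_code": map[string]any{
				"type":        "string",
				"description": "A five-digit US ZIP code.",
			},
		},
		"required": []string{"zip_code"},
	},
	Function: GetWeather,
}

// ExecuteTool finds the tool the model named and runs it with the JSON
// input the model produced.
func ExecuteTool(tools []ToolDefinition, name string, input json.RawMessage) (string, error) {
	for _, tool := range tools {
		if tool.Name == name {
			return tool.Function(input)
		}
	}
	return "", fmt.Errorf("tool %q not found", name)
}

func main() {
	tools := []ToolDefinition{GetWeatherDefinition}
	// Simulate the tool call a model might emit after reading the description.
	result, err := ExecuteTool(tools, "getWeather", json.RawMessage(`{"zip_code":"94103"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result)
}
```

The model only ever sees the name, description, and schema. The harness owns the lookup and the actual function call, and whatever string comes back (including an error message) is what gets sent to the model as the tool result.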
What I am doing
Agents have been built 100 times over with minor differences: some use Go, others use Rust; some are meant for agentic coding, while others are meant for executive work. So I asked myself: what can I do that's different and actually useful to me? At first I kept building a coding agent, and it truly made me appreciate the thought and engineering that goes into these harnesses. By harness I mean the scaffolding around the model: the loop, the tools, the prompting logic. The glue that makes it useful. For example, did you know that optimizing the way an agent reads and edits a file can improve the edit tool call success rate by anywhere from just under 10% to over 50%? These kinds of experiments and learnings are why I like building my own agent harness. The boring troubleshooting, the checklist stuff: I'd rather have an agent handle that so I can spend more time building tools and working with other teams on harder problems.
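As an illustration of the kind of edit-tool design choice this refers to, here is a rough sketch of one commonly used approach: the edit tool replaces an exact string and refuses to act when the match is missing or ambiguous, returning an error message the model can react to. This is an assumption about what such an optimization could look like, not the specific change behind the numbers above, and ReplaceUnique and EditFile are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// ReplaceUnique replaces oldStr with newStr only when oldStr occurs exactly
// once in content. Otherwise it returns an error that tells the model what
// to do differently on the next attempt.
func ReplaceUnique(content, oldStr, newStr string) (string, error) {
	if oldStr == "" {
		return "", errors.New("old string must not be empty")
	}
	switch n := strings.Count(content, oldStr); n {
	case 0:
		return "", errors.New("old string not found; re-read the file and try again")
	case 1:
		return strings.Replace(content, oldStr, newStr, 1), nil
	default:
		return "", fmt.Errorf("old string matches %d times; include more surrounding context", n)
	}
}

// EditFile applies ReplaceUnique to a file on disk.
func EditFile(path, oldStr, newStr string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	updated, err := ReplaceUnique(string(data), oldStr, newStr)
	if err != nil {
		return err
	}
	return os.WriteFile(path, []byte(updated), 0o644)
}

func main() {
	src := "a := 1\nb := 1\n"

	// Ambiguous: ":= 1" appears twice, so the edit is rejected.
	_, err := ReplaceUnique(src, ":= 1", ":= 2")
	fmt.Println("ambiguous edit:", err)

	// Unique: the edit goes through.
	out, err := ReplaceUnique(src, "b := 1", "b := 2")
	fmt.Printf("unique edit: %q, err: %v\n", out, err)
}
```

The point of the error strings is that they end up in the model's context as the tool result, so a rejected edit doubles as a hint for the retry.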
With so many coding agents out in the open, I didn't want to create yet another, especially since the current ones are good enough and better than what I believe I can create. I decided to go down the rabbit hole of building an AI-assisted troubleshooter or, as some companies are calling it, an "AI SRE". There are several reasons why I chose this path. Right now we spend a lot of time doing support for things that are easy to diagnose, but the time it takes across multiple investigations really adds up. Even if the agent can't find the exact problem, it can eliminate a few possible causes. A troubleshooting agent also gives you an extra set of hands: you can look at the P0/P1 issue while the agent looks at the P2/P3 issues. And there didn't seem to be a good open-source one, at least when I started building mine in December of 2025. Sure, a few companies were building one as their offering, and since then cloud providers like AWS and Google Cloud have created one for their own cloud, but with those you can't be cloud agnostic. Getting locked into one provider's implementation of something I want to own didn't sit right with me.
So am I building my replacement? I don't believe so. I think I'm building a teammate or coworker who can help when things get dire. In the limited time I've been using the agent, it hasn't always worked: sometimes it churns through tokens going in the wrong direction. This can of course be reduced through the harness and various techniques, but I don't think these quirks can be completely removed.
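One simple harness-level guard against an agent burning tokens in the wrong direction is a hard cap on how many turns the loop may take. Below is a minimal sketch of that idea. The Model interface, Step type, scriptedModel, and the check_logs tool name are all hypothetical stand-ins so the example runs without a real LLM; it is one possible technique, not a description of how my harness does it.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is one model turn: either a tool call or a final answer.
type Step struct {
	ToolName string // empty when the model is done
	Answer   string
}

// Model is a hypothetical stand-in for a real LLM client.
type Model interface {
	Next(history []string) Step
}

// ErrBudgetExceeded is returned when the loop runs out of turns.
var ErrBudgetExceeded = errors.New("step budget exceeded")

// RunWithBudget drives the agent loop until the model gives a final answer
// or maxSteps model turns have been used.
func RunWithBudget(m Model, runTool func(name string) string, maxSteps int) (string, error) {
	var history []string
	for i := 0; i < maxSteps; i++ {
		step := m.Next(history)
		if step.ToolName == "" {
			return step.Answer, nil
		}
		// Record the tool result so the next turn can see it.
		history = append(history, step.ToolName+": "+runTool(step.ToolName))
	}
	return "", ErrBudgetExceeded
}

// scriptedModel asks for a tool until it has seen `calls` results, then answers.
type scriptedModel struct{ calls int }

func (s scriptedModel) Next(history []string) Step {
	if len(history) < s.calls {
		return Step{ToolName: "check_logs"}
	}
	return Step{Answer: "done"}
}

func main() {
	runTool := func(name string) string { return "ok" }

	// A well-behaved run finishes inside the budget.
	answer, err := RunWithBudget(scriptedModel{calls: 2}, runTool, 5)
	fmt.Println(answer, err)

	// A run that never converges gets cut off instead of looping forever.
	answer, err = RunWithBudget(scriptedModel{calls: 100}, runTool, 5)
	fmt.Println(answer, err)
}
```

A turn cap is a blunt instrument. It doesn't make the agent smarter, it only bounds how much a bad run can cost, which fits the view that these quirks can be reduced but not removed.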
What's Next?
In upcoming articles I'll talk about what I've learned and the experiments I've conducted. For example, I plan to make the harness multi-model, potentially even with models from different providers. I also plan to write about optimizations that improve the harness and reduce its cost.
If you're working on something like this or have a good use case for it, let me know in the comments.