I’ve been using AI agents to assist with software development for at least two years now. GitHub Copilot was where my journey began, but I’ve tried many over the last couple of years. They started with “oh, thank you, you caught a semicolon I missed” and that was about all they were worth. Then they got better and could write functions that wouldn’t crash, but they sometimes took odd approaches to solutions, missed the greater context, duplicated features, and generally didn’t produce production-level code. Sometimes you could take what they wrote and make it work, but that was often just as much work as writing it yourself.
Fast-forward to this week. I was using Claude-4.5-Opus-High and had a handful of tasks to complete. I’ve been using this model for a few months now and it has been really good, but it felt like something changed this week. Of course this is anecdotal and a sample size of one, but I feel like in the last week or two, we crossed some threshold.
Mutex locking service
When I started working for my new employer, I quickly learned we were using Amazon S3 buckets as a mutex locking service. This served the team well enough for a few years, but was starting to show cracks. I began working on a new mutex locking service exposed via a RESTful API and backed by Redis. All the new code we’re writing that leverages mutex locking now uses this service.
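I won’t go deep on the service internals here, but the core primitive under the hood is the classic single-key Redis lock. Here’s a minimal sketch of that pattern, assuming the redis-py client; the key naming, TTL, and connection details are illustrative stand-ins, not our actual implementation:

```python
import uuid
import redis  # assumes the redis-py client

# Hypothetical connection; the real service wraps logic like this behind a RESTful API.
r = redis.Redis(host="localhost", port=6379)

# Release only if we still own the lock; the compare-and-delete must be atomic,
# so it runs as a Lua script on the Redis server.
RELEASE_SCRIPT = r.register_script("""
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
end
return 0
""")

def acquire(resource: str, ttl_seconds: int = 300) -> str | None:
    """Try to take the lock; the TTL guards against a crashed holder."""
    token = str(uuid.uuid4())
    # SET ... NX EX is the standard single-key Redis lock primitive:
    # it only succeeds if the key does not already exist.
    if r.set(f"lock:{resource}", token, nx=True, ex=ttl_seconds):
        return token
    return None

def release(resource: str, token: str) -> bool:
    """Release the lock only if the token proves we still hold it."""
    return bool(RELEASE_SCRIPT(keys=[f"lock:{resource}"], args=[token]))
```

The random token matters: without it, a client whose lock expired could accidentally delete a lock now held by someone else.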
I whipped up the mutex locking service in a matter of a couple of days, including updating our tools to leverage it. It has been working well, but I had some ideas to improve interactions with it. Namely, we often deal with servers in batches of 10, while the mutex service dealt with single units of resource locking. So if you wanted to lock all 10 servers, it would require 10 separate API calls. While we haven’t run into performance issues yet, batching locks makes a lot of sense for us and was a clear missing feature.
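To make the pain point concrete, here’s roughly what the pre-batching client pattern looked like. The endpoint path and payload shape below are hypothetical, not the real API, but the one-call-per-server loop is the point:

```python
import requests

BASE = "https://mutex.internal.example/api/v1"  # hypothetical service URL

servers = [f"server-{i:02d}" for i in range(1, 11)]

# Pre-batching: one POST per server, so locking a 10-server batch
# costs 10 separate round trips to the service.
tokens = {}
for server in servers:
    resp = requests.post(f"{BASE}/locks", json={"resource": server, "ttl": 300})
    resp.raise_for_status()
    tokens[server] = resp.json()["token"]  # assumed response shape
```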
Which brings us to this week. I wanted to update the mutex locking service to support batch operations, as well as create some new endpoints to allow the client to lock a batch of 10 servers by referring to their cluster name. Claude (by way of Cursor) helped me implement the new endpoints, which included new logic, leveraged existing functions, and called out to external services (we verify servers exist and are in a proper state in our inventory before locking them). I summarized my requirements in Cursor and within about 5 minutes, I had all the new endpoints I needed as well as regression tests, and when I reviewed the result, I was a bit taken aback.
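From the client’s side, the new shape looks something like the sketch below. Again, the paths and payloads are assumptions for illustration; the real endpoints differ in detail, but the idea is one round trip per batch or cluster instead of one per server:

```python
import requests

BASE = "https://mutex.internal.example/api/v1"  # hypothetical service URL

# Batch lock: one call covers an explicit list of resources.
resp = requests.post(
    f"{BASE}/locks/batch",
    json={"resources": [f"server-{i:02d}" for i in range(1, 11)], "ttl": 300},
)
resp.raise_for_status()
batch_token = resp.json()["token"]  # assumed response shape

# Cluster lock: the service resolves the cluster name to its member servers,
# verifies them against inventory, then locks them as a single unit.
resp = requests.post(
    f"{BASE}/locks/cluster",
    json={"cluster": "web-cluster-01", "ttl": 300},
)
resp.raise_for_status()
cluster_token = resp.json()["token"]
```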
Having experience in this world, when I have these sorts of tasks, I am able to visualize roughly how the solution should look before writing a single line of code. I know which internal functions already exist that should be leveraged to keep the code DRY, as well as various other structural pieces and the order of operations. The code generated in my Cursor IDE was nearly exactly what I was imagining. No major bugs. No duplicate logic. No odd approaches to solving the problems. There were a few minor things I cleaned up on my human pass through the code (mostly comments and a couple of personal style preferences, and that’s it), but the code “just worked”, followed the approach I would have taken, and took a fraction of the time I would have needed to write it.
While waxing poetic about this result with my coworkers, one of them had a great quote. He described the current state of Claude-4.5-Opus-High as “it’s like having a savant but inexperienced engineer at your fingertips”. This feels like a great description of the current state of “vibe coding”. I’m sure I hit a bit of a perfect use-case, but this was simply not possible even 6 months ago. And I’m also sure that this isn’t going to take my job (yet), because it still needs supervision. I’m excited to see where we will be in the next 6-12 months!
After my new batch-mutex locking changes were in place, our most common code path went from about 30 API calls per run down to 3: each batch of 10 single-lock calls collapsed into one batch call. Again, we weren’t fighting any sort of scaling issues yet, but that’s a pretty massive gain!
The Tragedy of Romeo and Juliet
This week, I’ve been listening to a bit of Sonata Arctica. I like a lot of their music, but for some reason, the song Juliet has been on my mind this week. I love the unique pacing in the different movements of the song, the orchestral metal undertones, and the lyrics. I’m sure the story of Romeo and Juliet is well known, so I’m not going to go into details about it here, but the tragedy of the story feels so well captured in this song. It also brings a bit of a different and unique perspective to the story I haven’t heard anywhere else. After Romeo takes the poison and his life is fading, he’s staring into Juliet’s eyes, which is where the lyrics in this song begin.
Knowing full well this is fiction, I still can’t help but mentally put myself into those lyrics. Palpable desperation seeping through weeping eyes as you realize your actions have led to such a tragic outcome. Misplaced frustration at the lack of understanding while you are paralyzed and unable to communicate. Some very stirring imagery.
I don’t generally love Shakespeare or any of the “great tragedies”, but a handful of them are quite entertaining. The Greeks had an unusual (by today’s standards) way of telling stories to entertain the masses, and given that variety is the seasoning in life, I do find them interesting at times. They are a lot for my empathetic heart, though; I have a hard time not putting myself in the characters’ shoes, so they are emotionally taxing for me.