How I use LLMs as a senior engineer on a complex project. Part 1.

The thing about my current project – it’s complex. Really complex. And critical. So critical that when it fails, you might see a blip in the NASDAQ-100. Jokes aside, AI struggles to make any meaningful contribution on its own. Believe me, I’ve tried. Simply asking it to “do the task” doesn’t work. I’ve had to come up with some creative ways to get the best out of it. Here are a few.

Use the Best Coding Model

Let’s start with the obvious: the underlying model matters – a lot. To find the best one, check out https://lmarena.ai. Go to Leaderboards → Language → Category: Coding. As of May 2025, Gemini 2.5 Pro is leading, but OpenAI o3 works great too.

If your company limits which models you can use (which is often the case due to code ownership or NDA concerns), your options may be fewer—but sometimes that actually makes things simpler.

Make sure the model supports thinking tokens—it significantly improves results. For best outcomes, use an agentic workflow, but this depends on the tool you’re using.
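For illustration, here’s roughly what those knobs look like as a request payload. The field names here are assumptions modeled on common provider APIs, not any specific SDK – check your provider’s reference for the exact schema:

```python
# Illustrative request payload for a reasoning-capable model.
# Field names ("reasoning", "effort", "input") are assumptions --
# consult your provider's API docs for the real parameter names.
request = {
    "model": "o3",                    # pick a strong coding model
    "reasoning": {"effort": "high"},  # allow the model more thinking tokens
    "input": (
        "Refactor the retry logic in the payment client. "
        "Constraints: keep the public interface unchanged."
    ),
}

print(request["reasoning"]["effort"])
```

The point is less the exact syntax and more the habit: explicitly ask for more deliberation on hard tasks rather than accepting the default.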

Context is Everything

The model needs to know everything relevant. Think of it like a human engineer lacking key details—they’d ask questions. Models don’t. They just make assumptions. And we all know how great those can be.

In general, context comes from three sources:

  1. The code itself

  2. The task definition (i.e. your prompt)

  3. RAG sources (retrieval-augmented generation—out of scope here)

If the model doesn’t have the information it needs, it simply won’t do a good job.

For backend systems, that context might include things like the data model, the contracts of upstream and downstream services, and any invariants the code must preserve.

So: provide all the relevant context in the prompt. Spell it out.
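One way I keep myself honest about “spelling it out” is to assemble the prompt mechanically from the pieces I’d want a human engineer to have. A minimal sketch – the helper and section names are mine, not any particular tool’s:

```python
def build_prompt(task: str, code_snippets: dict[str, str],
                 constraints: list[str]) -> str:
    """Assemble an LLM prompt from a task, relevant code, and constraints.

    Hypothetical helper for illustration; real tools (agents, IDE plugins)
    do their own context packing.
    """
    parts = [f"## Task\n{task}"]
    for path, source in code_snippets.items():
        parts.append(f"## File: {path}\n{source}")
    if constraints:
        parts.append("## Constraints\n"
                     + "\n".join(f"- {c}" for c in constraints))
    return "\n\n".join(parts)


prompt = build_prompt(
    task="Add a timeout to the HTTP client.",
    code_snippets={"client.py": "def get(url): ..."},
    constraints=["Do not change the public interface."],
)
print(prompt)
```

If a section comes up empty – no code, no constraints – that’s usually a sign the model is about to make assumptions on your behalf.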

Drafting in Parallel (a.k.a. Shadow Coder)

One of my favorite ways to tackle a task is to have an LLM work in parallel with me. I’ll spend 1–5 minutes writing a quick, lightweight prompt (sometimes called “lazy prompting”), toss in some relevant context, and let the model do its thing quietly in the background.

Meanwhile, I focus on another task.

I call this approach Shadow Coder—like a silent partner coding alongside me, often unnoticed until it hands me something useful.

Usually, within 30 minutes to 2 hours, it delivers some code. At that point, one of three things happens: the code is good enough to use nearly as-is, it’s a useful starting point I adapt, or it’s a dead end I discard.

Whichever it is, it rarely costs me more than 5 minutes to prompt and another 15 to review. That’s a pretty great deal.
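Mechanically, Shadow Coder is just fire-and-forget with a handle you check later. A sketch of the pattern, with a stubbed model call (`call_llm` is a stand-in for whatever API or agent tool you actually use):

```python
from concurrent.futures import ThreadPoolExecutor


def call_llm(prompt: str) -> str:
    """Stand-in for a real model/tool invocation.

    Replace with your provider's API; this stub just echoes the prompt
    so the pattern is runnable on its own.
    """
    return f"# draft for: {prompt}"


with ThreadPoolExecutor(max_workers=1) as executor:
    # Fire off the lazy prompt and get back to the main task.
    future = executor.submit(call_llm, "Add retries to the upload path")

    # ... meanwhile, work on something else ...

    # Later: collect the draft and spend ~15 minutes reviewing it.
    draft = future.result()

print(draft)
```

In practice the “executor” is an agent running in a separate terminal or tab; the shape is the same – submit cheaply, review later, discard without guilt.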

Fixing Tech Debt While I’m Too (Lazy) Focused

Sometimes I spot tech debt—like a stale config flag from five years ago—but I’m deep in another task. Normally I’d throw it on the to-do list (and forget it forever).

Now, I just take the exact same note I would’ve written in my task list and paste it into the LLM prompt instead. And surprisingly, that works.

The beauty is that these kinds of refactors often need little more context than the code itself. Success rate for me? Around 50%.
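The transformation is almost trivial, which is the point: the note I’d have written anyway becomes the prompt, plus a couple of guardrails. A sketch – the template wording and the file names are mine, purely illustrative:

```python
def note_to_prompt(note: str, files: list[str]) -> str:
    """Turn a tech-debt to-do note into a self-contained refactoring prompt.

    Hypothetical helper; in practice I just paste the note and file paths
    into the chat by hand.
    """
    file_list = "\n".join(f"- {f}" for f in files)
    return (
        f"Tech-debt cleanup: {note}\n"
        f"Relevant files:\n{file_list}\n"
        "Make the smallest change that removes the debt. "
        "Do not alter behavior, and keep the diff reviewable."
    )


prompt = note_to_prompt(
    "remove the stale ENABLE_LEGACY_AUTH config flag",
    ["config/flags.py", "auth/middleware.py"],
)
print(prompt)
```

The guardrails at the end (“smallest change”, “do not alter behavior”) do a lot of work here – without them, small cleanups have a way of turning into unsolicited rewrites.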

In the Next Part…

Let me know what you’d like me to cover next. Otherwise, I’ll share: