Home » Blog » How To Think About AI-Assisted Development

How To Think About AI-Assisted Development

It seems like everyone is doing AI-assisted development these days. For a lot of people, it feels like magic: you ask for a thing, and suddenly you have that thing, even if you have no idea how to make that thing yourself. Sometimes that’s exactly what you need. Other times it’s like a half-finished addition attached to your house with drywall screws and optimism.

Here’s the mental model I recommend to make the right decisions about using AI to assist development:

The human developer is the general contractor; AI is their crew of subcontractors.

If you’ve ever done work without a good general contractor, you already know why having an experienced one matters.

The IKEA Chair vs. The House

Even if you’ve never built anything before, you can still put together a chair from Ikea. The parts are pre-cut, the instructions are simple, and if you mess up, the worst outcome is you swear a little and redo a step. That’s vibe-coding at its best: quick, scrappy, exploratory. There’s a place for that, for sure.

But if you’ve never built a house before and you try to “learn by building” with no plan, you’ll do things in the wrong order, make irreversible mistakes, and end up redoing work constantly. I once did a bunch of drywall work to save money. When I called in the experts to mud and tape it, the result was so bad they had to do extra work to fix it and it ended up being more expensive than just hiring it all out in the first place.

The same dynamic shows up in software. Research shows that AI-assisted coding goes fast but sometimes creates rework that eats up all those gains. Prompt until it compiles, prompt until the tests pass (if there even are tests), prompt until the demo works. That approach is fine for small scripts, hobby projects, spike solutions, or throwaway prototypes. But if you’re building a real production system with real users, real security concerns, and a future you have to live in, vibe-coding without experienced oversight isn’t bold. It’s expensive. You just don’t get the invoice until later. Security vulnerabilities are the most expensive version of that deferred invoice; a hobbyist app that accidentally exposes user data isn’t just technical debt, it’s a serious liability

The General Contractor Model

A general contractor doesn’t usually swing the hammer much themselves. They aren’t framing, wiring, plumbing, or roofing the house. Their real job is higher-level, just like an AI-orchestrating developer. They:

Translate requirements into a plan
Sequence work to avoid rework
Set guardrails so people don’t do the wrong things
Keep the site and project organized
Inspect what gets built before it’s too late to change
Manage change orders from the customer
Deliver a finished house, not one that just looks okay from a distance

AI is a powerful crew of subcontractors. But subcontractors don’t replace the general contractor. They make a good GC faster than they’d be working alone. A bad GC with more subs just creates chaos at a faster rate.

Where the Analogy Holds Up

Here are a few ways this metaphor maps cleanly onto your team’s approach to AI-assisted development.

Sequencing

If you do things out of order, you get rework. Lots of it. In software, sequencing shows up as introducing seams before refactoring, agreeing on contracts before wiring integrations, setting up CI and linting before accelerating output, and planning data migrations before piling features on top.

AI will happily help you do the wrong step extremely quickly. The GC mindset is: “What’s the right next step so we don’t pay twice?”

Remodel vs. New Construction

AI shines on greenfield projects because there are fewer hidden constraints. You’re building new walls, not discovering mystery wiring behind old ones.

Large legacy codebases are remodels. They have inconsistent patterns, tribal knowledge, leaky abstractions, and scary corners that only exist because of production reality. Context size matters top. The model can’t “see” the whole codebase at once, unlike the developer who holds a mental map of the system.

So the GC move is to reduce how much the AI has to guess about, just like you would with a subcontractor: give tight scopes, provide the relevant interfaces and constraints, anchor changes in tests and contracts, and work in smaller, bounded slices.

One Trade at a Time

A roofer is not your electrician. Stop asking the model to “implement the feature, refactor the architecture, improve performance, and make it secure” in a single prompt. Instead, treat the model like a trade:

“Generate a migration plan and the SQL scripts.”
“Write tests that cover these edge cases.”
“Implement this function behind this interface with these constraints.”
“Review this PR against our security checklist.”
“Propose two approaches and list tradeoffs.”

One task. One clear definition of done.

Change Orders and Scope Creep

Construction deadlines get wrecked by endless “small tweaks.” Software does too, especially when AI is involved. Every time you say “actually also…” in a prompt, you’re issuing a change order. That’s not inherently bad, but an experienced GC knows to re-baseline the plan when scope shifts. Vibe-coding just keeps hammering and three “small tweaks” later, the model has quietly restructured something it shouldn’t have touched.

Where the Analogy Breaks Down

No metaphor is perfect, and it’s worth being honest about the gaps. Software is far more malleable than a building. You can undo mistakes relatively cheaply sometimes, reverting a commit is easier than un-pouring concrete. And unlike physical construction, you can spin up an identical copy of your “building” in seconds for testing.

The metaphor also understates how collaborative the relationship with AI can be. A subcontractor doesn’t help you think about the design, but AI genuinely can. Using it to explore approaches, pressure-test ideas, and draft alternatives is one of its highest-value uses, and that doesn’t map neatly onto the GC/sub relationship.

Still, the core insight holds: someone needs to be directing the work, sequencing it, and inspecting the results. That job doesn’t go away just because the tools got faster.

What This Actually Looks Like: A Walkthrough

Everything above is the theory. Here’s what it looks like in practice. I’ll walk through a single, realistic task: adding a new API endpoint to an existing service. This is the way I might actually approach it with AI as my subcontractor.

The task: our team needs a new POST /api/orders/{id}/refund endpoint on an existing .NET order management service. It needs to validate the order is eligible for a refund, process it through our payment provider, update the order status, and emit a domain event.

Before I Open the AI: Scoping the Work

I don’t start by prompting. I start by writing a short spec for myself: a few sentences I’d write on a sticky note if I were handing this to a teammate. This takes maybe five minutes:

What success looks like: A new endpoint that processes refunds for eligible orders, integrates with Stripe, updates order state, and emits an OrderRefunded event. Behind a feature flag. Covered by unit and integration tests.

Constraints: Follow our existing CQRS pattern (command handler + validator). Use the IPaymentGateway abstraction; don’t call Stripe directly. Orders can only be refunded if they’re in Completed status and less than 30 days old.

Non-goals: No partial refunds yet. No UI changes. No changes to the existing order state machine beyond adding the Refunded status.

This is my blueprint. Everything I ask the AI to do will trace back to it. If I can’t articulate what “done” looks like, I’m not ready to delegate, I’m just hoping.

Step 1: Lay the Foundation Before Building Walls

The vibe-coding version of this task would be a single prompt: “Add a refund endpoint to my order service.” The AI would produce something plausible, probably with a dozen decisions baked in that I didn’t make and may not agree with.

Instead, I sequence the work the way a GC would schedule trades. Electricians before drywall. Foundation before framing.

My first prompt isn’t about the endpoint at all. It’s about the test:

“Here’s our ProcessOrderCommand handler and its unit tests [pasted]. Write unit tests for a new RefundOrderCommand handler with these rules: the order must be in Completed status, must be less than 30 days old, and must not have been previously refunded. Test each validation rule independently. Follow the same test patterns and naming conventions you see in the existing tests. Don’t write the handler yet, just the tests.”

Why tests first? Because they force me to think through the behavior before the implementation exists, they give the AI an unambiguous target to hit when I ask it to write the handler, and they give me a way to verify the next step actually works. I’m laying foundation.

I review the tests. The AI nailed the pattern matching but missed a case: what happens if the order doesn’t exist at all? I ask it to add that test. Two minutes of review saved me from discovering a missing null check in production.

Step 2: Build the Walls

Now I ask for the handler, and the prompt is tight because I’ve already set up the guardrails:

“Now write the RefundOrderCommandHandler. It should pass all the tests you just wrote. Use our existing IPaymentGateway interface for the actual refund call. Here’s the interface [pasted]. Add a Refunded status to the OrderStatus enum. Emit an OrderRefundedEvent using our existing IDomainEventPublisher. Here’s that interface [pasted]. Don’t add any new dependencies or packages.”

Notice what I’m doing: I’m giving it the interfaces it needs to code against, telling it what it can’t change, and pointing it at the tests it needs to pass. This is a bounded work order. One trade, one job, clear definition of done.

The output comes back. I read it. All of it! It looks clean, follows the pattern. But I notice it’s catching a generic Exception from the payment gateway instead of the specific PaymentFailedException our gateway throws. Small thing. Easy to miss if you’re skimming. I ask it to fix that specific issue.

Step 3: Wire It Up

My next prompt is the API controller and route:

“Write the API endpoint POST /api/orders/{id}/refund that dispatches the RefundOrderCommand. Follow the same pattern as our existing CompleteOrderController [pasted]. Include the feature flag check using our IFeatureFlagService. The flag name should be enable-order-refunds. Return 404 if the order isn’t found, 409 if it’s not eligible, and 200 with the updated order on success.”

Again, this is one job, the existing patterns are provided, and specific status codes are defined. The AI doesn’t have to guess at any of this.

Step 4: Inspection

The code is written. Now I inspect it and I use the AI to help me do it.

“Review the RefundOrderCommandHandler for potential issues: thread safety, error handling gaps, missing validation, and anything that violates the patterns in the existing handler I showed you earlier. Be specific.”

It flags something I hadn’t considered: there’s no idempotency check. If the endpoint gets called twice, it’ll try to refund twice through Stripe. Good catch. I ask it to add an idempotency guard and a test for it.

I run the full test suite. I run our linter. I check that CI is green.

Step 5: The Punch List

The feature works. But “works” isn’t necessarily “shippable.” I have a few more prompts:

“Write a brief runbook entry for the refund endpoint: what it does, how to enable/disable the feature flag, what to check if refunds start failing, and how to manually verify a refund in Stripe’s dashboard.”

“Add structured logging to the handler. Log when a refund is initiated, when it succeeds, and when it fails, including the order ID and the reason for failure. Follow our existing logging patterns [pasted].”

Docs, observability, operational readiness. The punch list is unglamorous but it’s the difference between “it works on my machine” and “we can support this in production at 2 AM.”

What Just Happened

The whole process might have been six or seven prompts across an hour. A vibe-coded version might have been one prompt and fifteen minutes. But look at what the sequenced approach produced: tests that were written before the code, a handler that follows existing patterns and passes those tests, an endpoint behind a feature flag, an idempotency guard I might have missed entirely, structured logging and a runbook, and a clean PR that a reviewer can actually follow.

Every prompt was one job, scoped tightly, with the context the AI needed and nothing more. I wasn’t typing more code, I was directing the work and inspecting the results. That’s the general contractor model in practice.

Experience Is the Multiplier

If you hire a less-experienced general contractor who has only done carpentry, they might be talented and hardworking and still run the project poorly. They won’t know what order things need to happen in, where hidden risks live, what steps you can’t skip, or which shortcuts create future disasters.

It’s the same with AI and less-experienced developers. AI can amplify inexperience. A junior developer with AI might produce more code than ever before, but “more code” was never the goal. You want correct outcomes, properly integrated, with a future you can maintain.

An experienced developer using AI looks like a wizard, not because they type faster, but because they know which questions to ask, which steps to sequence first, and which output to reject. The AI didn’t give them that judgment. It just gave them leverage to apply it faster.

Bring It Home

If you’re building IKEA furniture software, like a proof-of-concept or a throwaway prototype, vibe-code away. Optimize for speed and learning. But if you’re building a house you have to live in, be the general contractor. Plan the work. Scope it tightly. Set guardrails. Inspect everything. And remember that AI is the fastest crew you’ve ever had, but a crew without a contractor just builds you an expensive mess, faster.

The developers who thrive in this era won’t be the ones who prompt the fastest. They’ll be the ones who direct the work the best.

Jonathan "J." Tower

Jonathan, or J. as he's known to friends, is a founding partner of Trailhead Technology Partners, a custom software consulting company with employees across the U.S., Europe, and South America. He is Vice President of the .NET Foundation, a 12-time recipient of the Microsoft MVP award for his work with .NET, and a frequent speaker at software conferences around the world. He doesn’t mind the travel, though, as it allows him to share what he's been learning and also gives him the chance to visit beautiful places like national parks—one of his passions. So far, he's visited 58 of the 63 U.S. national parks. J. is also passionate about building the software community. Over the years, he has served on several non-profit boards, including more than a decade as president of the board for Beer City Code, Western Michigan's largest professional software conference. Outside of work, J. enjoys hiking, reading, photography, and watching all the Best Picture nominees before the Oscars ceremony each year.

Free Consultation

Sign up for a FREE consultation with one of Trailhead's experts.

"*" indicates required fields

Our Gear Is Packed and We're Excited to Explore With You

Ready to come with us?

Together, we can map your company’s software journey and start down the right trails. If you’re set to take the first step, simply fill out our contact form. We’ll be in touch quickly – and you’ll have a partner who is ready to help your company take the next step on its software journey.

We can’t wait to hear from you!

Main Contact

Together, we can map your company’s tech journey and start down the trails. If you’re set to take the first step, simply fill out the form below. We’ll be in touch – and you’ll have a partner who cares about you and your company.

We can’t wait to hear from you!

About Us

Our Team

Core Values

Alliances

Awards

Join Us

What We Do

Services

Technologies

Industries

Insights

Blog

Podcast

Events

Courses

How To Think About AI-Assisted Development

The IKEA Chair vs. The House

The General Contractor Model

Where the Analogy Holds Up

Sequencing

Remodel vs. New Construction

One Trade at a Time

Change Orders and Scope Creep

Where the Analogy Breaks Down

What This Actually Looks Like: A Walkthrough

Before I Open the AI: Scoping the Work

Step 1: Lay the Foundation Before Building Walls

Step 2: Build the Walls

Step 3: Wire It Up

Step 4: Inspection

Step 5: The Punch List

What Just Happened

Experience Is the Multiplier

Bring It Home

Jonathan "J." Tower

Free Consultation

Sign up for a FREE consultation with one of Trailhead's experts.

Related Blog Posts

Backlog Zero: How AI Can Clear You Development Backlog

From Idea to Production in 4 Hours with AI: A Case Study

With AI Writing Code, What Are Developers For?

Our Gear Is Packed and We're Excited to Explore With You

Main Contact

Montage Portal

Project Background

Logistics

Custom App and Software Development

Cloud and Mobile Applications

User Experience and Interface (UX/UI) Design

DevOps

Technologies Used

Expertise

Our Gear Is Packed and We're Excited to Explore with You

Thank you for reaching out.