
APR 25, 2026 · 10 MIN READ

AI·Product Management

How to Choose an AI Partner (and Spot the Bad Ones)

Choosing the wrong AI partner has cost the SMBs I've watched more money than choosing the wrong technology. The mistake is hard to see at the time, because vendors are good at sounding credible in pitches. The signals that actually predict whether a partner will ship a working agent are the ones most owners don't know to ask about.

This post is the signal list: the questions to ask, the answers to listen for, the red flags that should end the conversation, and the green flags that should keep you in it.

Why this matters more than usual for AI

In a settled software category, picking a bad vendor costs you time but rarely the whole project. The project still ships, just slower or worse. With AI, the failure mode is different. A bad AI partner produces something that demos well, ships into production, and then quietly fails in a way that's hard to attribute. The agent makes weird calls in week 4, the team loses trust by week 6, and by week 10 the project is dead but nobody's officially called it.

The cost isn't just the partner's fees. It's the team's appetite for AI work for the next 12 months. SMBs that get burned on the first AI project usually don't try again for a long time.

So the partner choice carries more weight than the technology choice or the workflow choice. Get this one right, the other choices have a fair shot. Get this one wrong, the other choices barely matter.

The four signals that predict a real builder

After watching several dozen SMB AI engagements at close range, the partners that consistently shipped working agents had four things in common.

Signal 1: they can describe a reference engagement in technical detail

Ask: "Walk me through the last project you shipped. What did the agent do, what model did you use, what was the integration shape, what was the eval method, and what did it cost?"

A real builder answers this in 10 minutes with specifics. Model name, integration points, eval rubric, a real number for cost. Maybe redacts the client name but not the technical shape.

A slide-deck vendor pivots to "we partner with Fortune 500s on AI transformation". They name-drop platforms. They show a logo wall. They don't describe a single agent in the detail you'd need to evaluate the work.

This question alone filters out a lot of the bad pitches. It's why I lead with it on every reference call I do for an SMB.

Signal 2: they'll commit to a fixed-scope first engagement

Ask: "Could you scope and price a 60 or 90 day first build for me on a specific workflow we'd pick together?"

A real builder says yes, and the structure they propose looks something like: a short scoping phase (1 to 2 weeks), then a fixed-scope build phase (6 to 10 weeks), then a defined handoff. The price is a number, not a range. Maybe with a small contingency line item, but a number.

A slide-deck vendor wants to sell you a 6-month "discovery and strategy" phase first. They can't price the build until "we understand your business". They want a retainer. They get nervous when you push for a fixed price on a specific deliverable.

Real builders are willing to take the risk of a fixed scope because they know how to ship. Vendors who can't ship hide behind hourly billing and indefinite timelines.

Signal 3: they have an answer to the operating ownership question

Ask: "Who operates the agent after handoff, and what's your role in that?"

A real builder has a specific answer. Either they hand off cleanly to a named owner on your side after a defined ramp period, or they have an ongoing operating contract with clear deliverables (eval reports, monthly tuning, model swap support) at a known price. They've thought about this because they've done it.

A slide-deck vendor has a vague answer. "We'll make sure the team is supported." "We provide ongoing partnership." They haven't thought about operating because they haven't actually operated agents in production for clients.

This is the question that separates partners who have shipped from partners who have only built proofs of concept. Operating is the part that's only visible after the build ships, and only the partners with real production experience treat it as a first-class concern.

Signal 4: they can walk you through an evaluation harness

Ask: "How do you know the agent is working in production? Show me the eval method."

A real builder describes a specific evaluation method. A fixed dataset of representative inputs. A scoring rubric (could be human review, could be LLM-as-judge with calibration). A tracked metric over time. Alerts when the metric moves. Maybe they show you the dashboard from a prior engagement.

A slide-deck vendor talks about "monitoring" in the abstract. "We have observability." "We track KPIs." When you push for the specific eval method, the answer gets vague.

Evaluation is the most under-rated discipline in AI builds. The partners who do it well are the ones who've felt the pain of not doing it. Ask the question and listen for whether they've felt that pain.
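
To make the pieces concrete, here is a minimal sketch of the eval method described above: a fixed dataset, a scoring rubric, a tracked metric, and an alert when the metric moves. Everything in it is an illustrative assumption rather than any partner's actual harness: the names `EvalCase`, `run_eval`, and `should_alert`, the exact-match rubric, the 0.05 drop threshold, and the toy routing agent. A real harness would likely swap in human review or a calibrated LLM-as-judge for the rubric.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One representative input plus the answer a reviewer expects."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Simplest possible rubric (an assumption): 1.0 on a match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(agent: Callable[[str], str], cases: List[EvalCase],
             rubric: Callable[[str, str], float] = exact_match) -> float:
    """Score the agent on every case in the fixed dataset; return the mean."""
    if not cases:
        raise ValueError("eval dataset is empty")
    scores = [rubric(agent(case.prompt), case.expected) for case in cases]
    return sum(scores) / len(scores)


def should_alert(history: List[float], latest: float,
                 max_drop: float = 0.05) -> bool:
    """Alert when the latest score falls more than max_drop below the
    average of earlier runs. With no history there is nothing to compare."""
    if not history:
        return False
    baseline = sum(history) / len(history)
    return (baseline - latest) > max_drop


def toy_agent(prompt: str) -> str:
    """Hypothetical stand-in for the real agent: routes a message to a queue."""
    routes = {"Where is my invoice?": "billing", "I can't log in": "support"}
    return routes.get(prompt, "unknown")


cases = [
    EvalCase("Where is my invoice?", "billing"),
    EvalCase("I can't log in", "support"),
    EvalCase("Cancel my plan", "retention"),
    EvalCase("Reset my password", "support"),
]

score = run_eval(toy_agent, cases)    # mean rubric score for this run
history = [0.75, 0.75, 0.75]          # hypothetical scores from earlier runs
alert = should_alert(history, score)  # True when the score has dropped
```

The point of the sketch is the shape, not the rubric. A partner who has run agents in production should be able to name their equivalent of each of the four pieces without hesitation.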

The red flags that should end the conversation

Some signals are so reliably bad that I tell SMB owners to end the call when they appear.

Multi-year contracts before any working agent has shipped. Real builders are willing to earn the second engagement after the first one works. Vendors who need a multi-year commitment upfront are trying to lock in revenue before you discover they can't deliver.

A "discovery phase" that ends in a strategy document instead of a prototype. Discovery is fine. Discovery that produces only a deck is not. After 2 to 4 weeks of discovery, a real builder is ready to scope and price a specific build. A vendor whose discovery ends in "next, we'd recommend a $200k strategy engagement" is selling you the deck, not the build.

A refusal to give a fixed quote on a tightly scoped first build. If you can describe the workflow in two paragraphs and the partner can't give you a price, they don't know how to scope. Which means the build will overrun. Move on.

Selling you on AI strategy when you asked for an AI build. Sometimes you genuinely need strategy. Most of the time, owners who ask for a build know what they want and the vendor who pivots them to strategy is trying to sell something else. Notice the pivot.

No technical references they can describe in detail. Logo walls aren't references. A reference is a partner who can tell you about a specific build, what worked, what didn't, and what they'd do differently. If every project they describe is anonymized to the point of being uninformative, they may not have shipped what they're claiming to have shipped.

Wild ROI claims in the pitch. 10x ROI in year one isn't real for an SMB AI project. The vendor who's quoting it is either lying or using accounting tricks. I've covered the honest ROI math elsewhere. Anyone whose pitch numbers diverge wildly from those ranges is overselling and you'll feel the consequences six months in.

The green flags that should keep you talking

Equally important are the signals that should give you confidence.

They push back on your scope. Real builders know when you're asking them to build the wrong thing. They'll tell you. The vendor who agrees with everything you propose is either lazy or planning to bill you for the scope changes later.

They want to talk to the operating owner, not just the budget holder. If the partner asks to meet the person who'll run the agent after handoff, take that as a strong green flag. They know operating ownership is what makes the project work, and they want to validate it before signing.

They've shipped a build in your industry or in your shape of business. Industry experience matters less than people think for an AI build, but workflow shape (B2B SaaS support, retail ops, professional services billing) matters more. A partner who has shipped a similar shape will move faster and make fewer scoping mistakes than one going in cold.

They have a small portfolio of prior agents and they'll show you metrics. Not redacted to nothing, not a generic case study deck. Actual error rates, actual hours-saved numbers, actual user adoption curves. Partners willing to share this are confident in their work.

They write down their assumptions and pricing in a one-page proposal. The shorter the proposal, the better the partner. A 40-page response to your RFP is the mark of a vendor who pads. A one-page proposal covering scope, price, timeline, and assumptions is the mark of a builder.

How to run the first conversation

For an owner doing this without a procurement playbook, this is the structure I recommend.

First 20 minutes: you describe the business, the candidate workflow, and what you hope to ship in 90 days. Don't pitch your problem; describe it in operational detail.

Next 20 minutes: ask the four signal questions above. Listen, don't interrupt. Take notes on the specifics, not the abstractions.

Last 20 minutes: ask three "what happens if" questions. What happens if the agent's error rate doesn't hit target by week 8? What happens if our integration takes longer than expected? What happens if our operating owner leaves halfway through? The answers tell you how the partner thinks about risk.

By the end of the call you should have enough signal to decide whether to take the conversation further. If you don't, the partner didn't give you enough to work with. That's its own signal.
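
One way to turn the call notes into a decision is a small scorecard. The sketch below is only an illustration of the logic in this post: any red flag ends the conversation, and otherwise you count how many of the four signal questions got a specific answer. The names `CallNotes` and `next_step`, the signal labels, and the threshold of three specific answers are my assumptions for the example, not a rule from the engagements described here.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Labels for the four signal questions (hypothetical identifiers).
SIGNALS = (
    "reference_engagement",
    "fixed_scope_first_build",
    "operating_ownership",
    "evaluation_harness",
)


@dataclass
class CallNotes:
    """What you wrote down during the call."""
    # Signals the partner answered with specifics rather than abstractions.
    specific_answers: Set[str] = field(default_factory=set)
    # Any red flags that came up, in your own words.
    red_flags: List[str] = field(default_factory=list)


def next_step(notes: CallNotes, min_signals: int = 3) -> str:
    """Return 'end', 'continue', or 'insufficient'.

    A single red flag ends the conversation. Otherwise the partner needs
    specific answers on at least min_signals of the four questions
    (the threshold is an assumption; adjust it to your own bar).
    """
    if notes.red_flags:
        return "end"
    hits = sum(1 for signal in SIGNALS if signal in notes.specific_answers)
    return "continue" if hits >= min_signals else "insufficient"


notes = CallNotes(
    specific_answers={
        "reference_engagement",
        "fixed_scope_first_build",
        "evaluation_harness",
    }
)
decision = next_step(notes)
```

An "insufficient" result is not neutral. As noted above, a partner who leaves you without enough to work with has told you something.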

Elsewhere, I cover how to de-risk your first AI project more broadly, including partner selection in the context of the full project risk frame. The first conversation is the first place to spot risk, and a structured one beats an unstructured one every time.

A short note on building the influence to push back

The owner-vendor conversation is harder than it should be because owners often don't feel technically credible enough to push back. They worry about looking ignorant. So the vendor's claims go unchallenged and the contract gets signed.

You don't need to be technically credible to push back. You need to ask specific questions and refuse to accept vague answers. That's a different skill. Some of what I've written about influence without authority applies here. The version that's most useful for an AI partner conversation: you hold all the cards in the room because you control the budget. Use that leverage. Ask the specific question, wait for the specific answer, and walk away when you don't get one. The right partners will respect the discipline. The wrong ones will be filtered out by it.

FREQUENTLY ASKED

What should I look for in an AI consulting partner?
A reference engagement they can describe in technical detail, a fixed-scope first build option, a clear answer on operating ownership after handoff, and the ability to walk you through their evaluation method. If they reach for slide decks instead of specifics on any of these, walk away.

What are the red flags in an AI vendor pitch?
Multi-year contracts before any working agent has shipped. A 'discovery phase' that ends in a strategy document instead of a prototype. No clear plan for who operates the agent after handoff. Refusal to give a fixed quote on a tightly scoped first build. Selling you on AI strategy when you asked for an AI build.

Should I hire an AI partner or build in-house?
Build in-house if you have a senior engineer with bandwidth and the appetite to learn the LLM tooling. Hire a partner if your team is busy running the business or if you want the first build done in eight weeks instead of six months. Most SMBs are better off with a partner for the first two builds and bringing it in-house once they've shipped.