I've sat through a lot of AI SDR demos. They all follow the same script. The sales rep pulls up a prospect — someone from a named account you'd recognise, with a recent funding round and a clear buying trigger. The AI surfaces three signals. It drafts an email that sounds like a human wrote it on a good day. The rep says "and this runs for every prospect in your pipeline, automatically."
Then you sign. You upload your list. The research comes back thin. The emails are interchangeable. The reply rate is 0.5% and nobody can explain why it's worse than what you were doing before with templates.
The demo wasn't lying, exactly. It was just showing you the best-case scenario on a hand-picked prospect. What it wasn't showing you was what the tool does with your actual ICP, at scale, on prospects without obvious signals. That's the product you're buying. The demo is a highlight reel. Your trial is the season.
The Demo Problem
AI SDR vendors optimise their demos the same way SaaS analytics vendors do: they pre-load data that looks good. The prospect in the demo is well-documented online. There's recent news. There's a LinkedIn update from last week. The research layer has plenty to work with.
Your average target prospect is not that. They haven't posted on LinkedIn since 2023. Their company doesn't issue press releases. The last funding round was four years ago. The only available signal is that they match your ICP criteria. What does the tool do then? In most cases, it generates a vaguely personalised email based on their job title and company industry, dressed up with enough sentence variation to avoid looking like a template.
That's not a product failure — it's an honest reflection of what "AI research" usually means. The failure is in not showing you this during the evaluation.
5 Questions That Expose Weak AI
Sales reps hate these. Which is exactly why you should ask them.
1. What data sources does the research layer actually pull from?
Not a marketing answer — a technical one. LinkedIn, news, job postings, funding databases? Does it pull live or cached? How frequently is data refreshed? "Proprietary AI" is not a data source. If they can't name the sources, the research layer is thinner than the demo suggests.
2. What happens when a prospect has no recent signals?
This is the real test. About 60–70% of any realistic B2B list has no obvious buying trigger right now. What does the tool produce for those contacts? A good answer is "we surface that they're low-signal and let you decide whether to proceed." A bad answer is "our AI still generates a personalised email" — because what they mean is it generates a template with company name and title inserted.
3. Can I see the reply rate for customers with an ICP similar to mine, not your best case study?
Case studies are selected survivors. Ask for reply rates across their whole customer base, broken down by industry and ICP. Ask what the distribution looks like, not just the top quartile. If they only have case studies and no aggregate data, that's informative.
4. Does anything send without human approval, ever?
Some tools have "autopilot" modes buried in the settings. Others send follow-up sequences automatically after a first touch is approved. Know exactly what triggers an automated send and what requires a human decision. If the answer is "you can configure it to fully automate," you're looking at a volume machine with an opt-out, not an AI assistant with a human in the loop.
5. Can I run the trial on my own prospect list, not your demo data?
If a vendor won't let you test with your actual prospects, that is the answer. The whole point of a trial is to see the product behave with your real-world inputs. A trial limited to their sandbox data is a longer demo, not a test.
What "AI-Powered Research" Actually Means
The phrase is in every AI SDR pitch deck. What it typically means varies enormously.
| What vendors claim | What it usually is | What it should be |
|---|---|---|
| AI research | LinkedIn scrape + recent news headline | Structured buying signals: funding, hiring patterns, role changes, tech stack, product launches |
| Personalisation | Company name + job title + one scraped fact | Signal-specific context that only makes sense for this prospect right now |
| Automated prospecting | ICP filter → bulk send | Signal monitoring → qualified shortlist → human approval → send |
| Human in the loop | A review screen most users skip in bulk | Approval required before any email sends; research surfaced for context |
The test for real research: ask the tool to produce output on a prospect you know well. If the research tells you things you already knew from a 10-second Google search, it's not research — it's data formatting. Real research surfaces something you didn't know: that their Head of Sales just changed, that they're hiring five SDRs right now, that they just launched into a new market. If it doesn't add signal, it's not doing the job the name implies.
The Reply Rate Test
Here's what to ask for: the median reply rate, not the mean. A mean is dragged up by outliers: one customer with a perfect ICP and a great content strategy inflates the figure for the whole cohort. The median is more honest.
Also ask for the distribution. If 10% of customers see 5%+ reply rates and 60% see under 1%, that's a very different product from one where most customers cluster around 2–3%. The top decile numbers in the case study library don't tell you what your experience will look like.
If the vendor can't or won't share this data, draw the conclusion that the data doesn't support their pitch. Companies with strong metrics share them.
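To see why the median and the distribution matter, here is a minimal sketch in Python. The reply rates are invented for illustration and do not come from any vendor; `share_in_band` is a hypothetical helper, not part of any tool.

```python
from statistics import mean, median

# Hypothetical per-customer reply rates (in percent) for one vendor's cohort.
# These numbers are invented purely to illustrate the arithmetic.
reply_rates = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.2, 1.8, 2.5, 9.6]


def share_in_band(rates, low, high=float("inf")):
    """Fraction of customers whose reply rate falls in [low, high)."""
    return sum(low <= r < high for r in rates) / len(rates)


cohort_mean = mean(reply_rates)      # pulled up by the single 9.6% outlier
cohort_median = median(reply_rates)  # what the typical customer sees
under_1 = share_in_band(reply_rates, 0.0, 1.0)
over_5 = share_in_band(reply_rates, 5.0)

print(f"mean: {cohort_mean:.2f}%  median: {cohort_median:.2f}%")
print(f"under 1%: {under_1:.0%}  5% or more: {over_5:.0%}")
```

In this made-up cohort the mean is 1.9% while the median is 0.85%, and 60% of customers sit under 1%. A vendor quoting only the mean, or only the one customer at 9.6%, would be describing a product most of the cohort never experienced.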
Red Flags That Tell You Everything
Some things aren't ambiguous. If you see any of these, adjust your expectations accordingly:
- No trial, or trial locked to their demo data. There's one reason for this.
- "Proprietary AI" with no specifics. Every vendor has proprietary AI. The ones who don't say what it does usually don't have a meaningful answer.
- Reply rate data as case studies only. Selected success stories are not performance data. They are marketing.
- The demo rep controls every input. You should be able to paste in a prospect name mid-demo and watch the tool work live. If that makes them uncomfortable, ask yourself why.
- Approval step clearly designed to be bulk-skipped. If the UI makes it easy to approve 200 emails in 90 seconds with no friction, the "human in the loop" feature is cosmetic. A real approval workflow surfaces the research, shows you the draft, and makes you engage with it before confirming.
- Volume as the primary value proposition. "Send at scale" is not a benefit if what scales is mediocrity. The pitch should be about relevance, not volume.
What a Real Trial Looks Like
If you get to a trial, structure it properly. Take 15–20 of your actual target accounts — a realistic mix of high-signal and low-signal prospects. Run them through the tool. Evaluate three things separately: the quality of the research output, the quality of the first draft, and the friction level of the approval step.
Don't judge the tool on whether the email is grammatically correct. Judge it on whether the research gave you something to work with, whether the draft reflects that research, and whether you'd genuinely want to send that email to that person on that day. That's the bar. Not "is this better than a generic template?" — that's the wrong comparison. The bar is "would a well-briefed SDR send this?"
If you can answer yes to that for more than half the trial outputs, you might have found something worth buying. If you're editing every email significantly before approving, or skipping half the contacts because the research is thin, you know what you need to know.
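One way to keep the trial honest is to record a yes/no judgement per account and tally the results afterwards. The sketch below is a hypothetical scoring sheet, not a feature of any product: the `TrialResult` fields, the account names, and the sample judgements are all assumptions made for illustration, and approval-step friction is left out because it is harder to reduce to a yes/no.

```python
from dataclasses import dataclass


@dataclass
class TrialResult:
    account: str
    high_signal: bool          # did you expect an obvious buying trigger?
    research_useful: bool      # research told you something you didn't know
    draft_uses_research: bool  # draft actually reflects that research
    would_send: bool           # "would a well-briefed SDR send this?"


def pass_rate(results, attr):
    """Share of results where the given yes/no judgement is True."""
    if not results:
        return 0.0
    return sum(getattr(r, attr) for r in results) / len(results)


def summarise(results):
    """Score research, draft, and send-worthiness separately."""
    low_signal = [r for r in results if not r.high_signal]
    would_send = pass_rate(results, "would_send")
    return {
        "research_useful": pass_rate(results, "research_useful"),
        "draft_uses_research": pass_rate(results, "draft_uses_research"),
        "would_send": would_send,
        "would_send_low_signal": pass_rate(low_signal, "would_send"),
        # The article's bar: yes for MORE than half the trial outputs.
        "worth_a_closer_look": would_send > 0.5,
    }


# Invented sample judgements; a real trial would use 15-20 of your own accounts.
results = [
    TrialResult("Account A", True, True, True, True),
    TrialResult("Account B", True, True, False, False),
    TrialResult("Account C", False, False, False, False),
    TrialResult("Account D", False, True, True, True),
]
summary = summarise(results)
print(summary)
```

Splitting out the low-signal accounts is a deliberate choice here: those are the prospects the demo never showed you, so their pass rate is the number to watch.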
The AI SDR market is full of demos optimised to impress and products optimised for volume. The two are not the same thing. The buyers who avoid getting burned are the ones who bring their own prospects to the evaluation, ask the uncomfortable questions about methodology, and treat the trial as a test — not an onboarding. That's not a high bar. It's just a bar that most buyers skip in the excitement of the pitch.
Evaluate us. Bring your own prospects.
Drumroll runs on your real targets. No demo sandbox, no pre-loaded data. See what the research surfaces on accounts you actually care about.