How to evaluate a pitch deck: good signs vs red flags

Q: How long should evaluating an early-stage deck take?

The first read takes about 20 minutes and tells you whether to keep going. The verification (tracing the numbers, checking the benchmarks, and confirming ownership) takes days, and it is the part most people skip.

The deck on my desk leads with a 23% response rate in a hard tumor type, a clean logo, and the words "best-in-class" on slide three. Nothing on that slide is a lie. The question is whether any of it is a fact.

I have read a few hundred early-stage decks like this one, most of them for assets I was being asked to back, partner on, or pass on. They blur together. The mistakes do not. When I evaluate an early-stage pitch deck I am not grading the design or the story. I check eight things: the core thesis, the data behind the headline number, the benchmark, the choice of beachhead, the competitive map, the market size, the team and ownership, and the ratio of polish to substance. Each is a place where the gap between the claim and the evidence likes to hide. What follows is the good sign, the red flag, and a hypothetical number for each, with that 23% deck kept in view. The rule underneath all of it: verify, do not trust.

What I look at	Good sign	Red flag
Core thesis	A specific mechanism with a logical reason it wins	"Best-in-class" or "first-mover" as a self-applied label
Data integrity	Headline number traces to a primary document	The hero number lives only in the deck
Benchmark	Same population, line, and comparator type	A combination compared to a rival's monotherapy
Beachhead	A fast, clean readout where the mechanism has a real shot	The exact setting where the approach keeps failing
Competitive map	An honest placement, front-runners included	Category leadership claimed two stages behind
Market sizing	Bottom-up, with penetration and duration haircuts	Full population times list price, landing at "$1B+"
Team and ownership	The asset is owned, the operators are named	An optioned asset, undisclosed upstream economics, no team slide
Polish ratio	The deck shows you the weak spots	The narrative is flawless and the data is thin

Start with the thesis: differentiated, or a dressed-up me-too?

The first question is whether the company has a genuinely different idea or a familiar one in better clothes. A good thesis names a specific mechanism or insight and gives a logical reason it wins where the incumbents stall. A weak one leans on "best-in-class" or "first-mover," which are labels, not findings. The tell is the source of the superlative. If an independent benchmark calls the asset best-in-class, that is evidence. If the only place the phrase appears is the company's own marketing, it is a hope with a font. In our example, "best-in-class" on slide three cites nothing. That is not yet a problem with the science. It is a problem with the claim, and it tells me where to push.

Data integrity: can the headline number be sourced?

The headline number is only as good as the document you can trace it to. What I want is a pre-specified endpoint I can read for myself, in a published paper or a primary document in the data room, not a figure whose only home is the slide. Take the 23% response rate. I go looking for it, and the only public readout I find says "clinical benefit observed," with no figure attached. The 23% is real in the sense that someone calculated it, but it is a post-hoc cut living in the deck, not a result the company has stood behind anywhere a regulator or a journal would hold them to it. Early descriptive data dressed up as a hard result is the most common way a deck flatters itself. Verify the number against a primary source, or treat it as marketing.

Benchmark integrity: is the comparison apples to apples?

A benchmark is only honest if the two things being compared are the same kind of thing. Hold three things constant: the patient population, the line of therapy, and the comparator type. If any of the three differs, the comparison is marketing, not evidence. The classic cheat puts your combination up against a rival's monotherapy, drawn from a different study, and calls the result "on par." Two traps recur. The first is the cherry-picked subgroup: a flattering n=4 sub-subgroup lifted from a table whose headline number is actually below the standard option. Four patients is an anecdote wearing a percentage. The second is the boring row. Founders point at the response rate and hope you do not read the line below it. Read it. In the archetype I see most often, the top-line metric looks competitive while the median duration of response sits below a cheap, generic incumbent. The exciting row sold the slide; the boring row is the investment.
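To put a rough number on why a four-patient subgroup is an anecdote, here is a minimal sketch that computes an approximate 95% Wilson score interval for a response rate. The counts are hypothetical (3 responders of 4, against the same rate in 100 patients) and are not taken from any real deck; the function name is my own, and the Wilson interval is one reasonable choice among several for small samples.

```python
import math

def wilson_interval(responders, n, z=1.96):
    """Approximate 95% Wilson score interval for a response rate."""
    p = responders / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical counts, chosen only to illustrate the width of the interval.
small_lo, small_hi = wilson_interval(3, 4)      # 3 of 4 patients respond
large_lo, large_hi = wilson_interval(75, 100)   # same 75% rate, 100 patients

print(f"n=4:   {small_lo:.0%} to {small_hi:.0%}")
print(f"n=100: {large_lo:.0%} to {large_hi:.0%}")
```

Under these assumptions the four-patient interval spans most of the range between a weak result and a spectacular one, while the hundred-patient interval is far tighter. The percentage on the slide is the same in both cases; the evidence behind it is not.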

Choice of beachhead: smart wedge, or a graveyard?

The entry indication tells you whether the team thought about the readout or only about the story. A smart beachhead gives the mechanism a real shot and returns a fast, clean answer. A poor one heads straight for the setting where this class of approach has failed again and again, with the failure relabeled "high unmet need." That phrase deserves translation. High unmet need sometimes means a real gap. Just as often it means many good people have tried here and the graveyard is full. If the deck picks the hardest, most-failed indication and frames the difficulty as the opportunity, I want to hear why this attempt is different at the level of biology, not at the level of ambition.

Competitive honesty: where do you really sit?

How a company describes its rivals is a character reference. The dishonest move is to claim category leadership from two development stages back, a Phase 1 asset billed as the leader while a competitor sits in Phase 3. When the map is drawn that way, the problem is not the pipeline, it is the narration, and it makes me reread every other claim. An honest deck places the company where it actually sits, front-runners and all, including the ones the founder would rather you forgot. The founders who show me the competitor that scares them most are the ones I trust further.

Market sizing: grounded, or reverse-engineered to a round number?

A market size is the most negotiable number on any slide. A grounded number is built bottom-up, with explicit haircuts for the share the drug will realistically capture and how long patients actually stay on it. A reverse-engineered one multiplies the full addressable population by the full list price, applies no discount, and lands suspiciously close to "$1B+." Do the arithmetic. Forty thousand eligible patients at $25,000 a year is the billion-dollar slide. Now assume the drug reaches 45% of them at peak, which would be a strong launch, and that the average patient stays on therapy for eight months rather than a calendar year. Forty-five percent of two-thirds of a year is 30% of the headline, so the same market is worth about $300 million. That is not a smaller opportunity dishonestly described. It is the real one, and a founder who shows me the $300 million with the haircuts visible has more credibility than one who shows me the billion.
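The arithmetic above can be sketched in a few lines, which makes it easy to rerun with a founder's own inputs. This is a minimal illustration, not a valuation model: the function and parameter names are my own, and it treats the haircuts as simple multipliers on the hypothetical figures from the paragraph (40,000 patients, $25,000 a year, 45% penetration, eight months on therapy).

```python
def bottom_up_market(eligible_patients, annual_price, peak_penetration, months_on_therapy):
    """Annual market estimate after penetration and duration haircuts."""
    duration_fraction = months_on_therapy / 12  # share of a year actually on therapy
    return eligible_patients * annual_price * peak_penetration * duration_fraction

# The "billion-dollar slide": full population times full price, no haircuts.
top_down = 40_000 * 25_000

# The same market with the haircuts from the text applied.
grounded = bottom_up_market(
    eligible_patients=40_000,
    annual_price=25_000,
    peak_penetration=0.45,
    months_on_therapy=8,
)

print(f"Top-down slide: ${top_down:,.0f}")
print(f"With haircuts:  ${grounded:,.0f}")
print(f"Share of headline that survives: {grounded / top_down:.0%}")
```

Swapping in net price for list price, or a lower penetration, shrinks the number further. The point of writing it out is that every haircut becomes a named input someone has to defend.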

Team and ownership: is the thing being sold actually owned, and who runs it?

Before the science, settle two boring questions: does the company own the asset, and who is going to run it. The good answer is dull: the company owns the asset, the economics are disclosed, and the operators are named and have done this kind of work before. The bad answer is an asset the company has only optioned or in-licensed and does not yet own, with undisclosed upstream economics sitting ahead of investors in the return stack, so a 2% royalty you were never shown quietly reorders who gets paid. Then there is the missing team slide. For a single-asset company the team is the investment, because the asset will change as the data come in and the people are what you are actually buying. A deck that spends 15 slides on the molecule and none on the operators has told you where its confidence runs out.

Polish-to-substance ratio: is the deck better than its data?

The most dangerous deck is the one that is better than its data. When the narrative is flawless and the evidence under it is thin, the gap between the two is itself the finding. Polish is cheap now; anyone can buy a beautiful deck. So I watch the ratio. A deck that is exactly as good as its evidence is reassuring even when the evidence is early. A deck that is markedly better than its evidence is telling me that effort went into the persuasion rather than the proof, and I price that. Good decks let you find the weak spots. The best founders save you the trouble and tell you where they are.

What a great deck does differently

The teardown is the easy part, so let me turn it around. A great early-stage deck is not the one with no holes. It is the one that shows you the holes before you find them. It sources its headline number to a document. It benchmarks against the comparator that matters, including the boring row. It picks a beachhead with a clean readout and says why. It draws the competitive map with the front-runner on it. It sizes the market with the haircuts in plain view. It states what it owns and who is in the room. None of that requires better data. It requires a founder who would rather earn your trust slowly than borrow it for one meeting. When I see that, the 23% response rate stops being a worry and starts being a question I am glad to ask, because I already believe the answer will be honest.

Questions I get about reading early-stage decks

What are the biggest red flags in an early-stage pitch deck? A headline number that appears nowhere outside the deck, a benchmark that compares unlike things, "best-in-class" with no independent source, a market size reverse-engineered to a round billion, and an asset the company does not actually own yet. Any one of these is a reason to slow down. Two or more together, and you usually have the whole story.

How do you evaluate market size claims in a pitch deck? Rebuild the number bottom-up. Take the eligible population, apply a realistic peak penetration rather than 100%, and multiply by net price and actual treatment duration, not full list price for a full year. A claimed $1 billion market often becomes $300 million or less once the penetration and duration haircuts go in. The size of the haircut, and whether the founder applied it themselves, tells you most of what you need.

Can you trust the data in a pitch deck? Not on sight. Treat every headline figure as a claim until you can trace it to a primary document, a published readout, or a regulatory filing. Early descriptive data is often dressed up as a hard result. The discipline is simple to state and hard to keep: verify, do not trust.

Why does the team slide matter for a single-asset company? Because the asset will change as the data arrive, and the people are what stays. For a one-asset company the team is the investment. A deck with no team slide, or with operators whose track record does not match the work in front of them, is a gap no amount of polish on the science slides will close.

How long should evaluating an early-stage deck take? The first read takes about 20 minutes and tells you whether to keep going. The verification (tracing the numbers, checking the benchmarks, confirming ownership) is the part that takes days, and it is the part most people skip. The same instinct shows up after a letter of intent is signed, which is why a disciplined post-LOI window matters as much as the diligence that precedes it.

The framework is not a way to find reasons to say no. It is a way to make sure that when you say yes, you are saying it to the real company and not to the deck. It pairs with the advisor you pick to run the process and with the discipline of pricing an asset honestly, and all three come back to the same habit. Verify, do not trust.
