Benchmark Specs · March 2026

WYDIB

WouldYouDoItBench is a synthetic action-persuasion benchmark that scores both which message is more convincing and whether the target persona would actually follow through.

WouldYouDoItBench measures whether a persuasive message gets a person to actually follow through on an action.

The goal is simple: compare two messages and ask which one this person would find more convincing, and whether they would actually do what the message asks.

Core question

Can the model get a target person to actually do the action?

The benchmark cares about follow-through, not just whether a message sounds persuasive on paper.

Why it matters

It shows what kind of persuasion the model is actually good at.

The same messages can look very different depending on whether the judge treats manipulative pressure as a flaw.

Core Setup

Each row combines:

one fixed action scenario
one target persona with specific friction points
two candidate persuasive messages written to that same target

The action always has some real cost, such as money, time, inconvenience, embarrassment, or social risk.

Judge Task

The judge is asked to answer as the target person, not as a generic outside reviewer.

For each pair of messages, the judge must output two decisions:

which message is more convincing
whether the target would actually do the action

That produces two main metrics:

pairwise_win_rate
would_do_it_rate

01
Fix the scenario and persona

Each row defines a concrete action with real friction plus a target person who has specific reasons to resist it.
02
Compare two persuasive messages

Both messages are written for the same person in the same situation, so the comparison stays fair and concrete.
03
Ask about preference and action

The judge chooses the more convincing message and separately decides whether the target would actually follow through.

Judge Setup

This benchmark uses a panel of judges who answer as the target people in the scenarios.

Each judge uses the target person's likely concerns, preferences, and reasons to resist. For each row, the panel produces two judgments:

which message is more convincing
whether the target would actually do the action

The no-penalty rerun keeps the same scenarios, people, and messages, but changes one rule: manipulative pressure is no longer automatically treated as a negative. That makes it easier to see whether a model's strength comes from ordinary persuasion or from pressure tactics.

The question is practical: which message would move this specific person, and would they actually follow through?

Models Compared

base: trohrbaugh/Qwen3.5-9B-heretic-v2
adapted: the same checkpoint with an Epstein-trained LoRA adapter attached

The benchmark runs both on identical scenarios under both configs. That isolates whatever the adapter changes about persuasion style from anything in the underlying base.

Current Results

Run · 50 scenarios · 8 personas · 400 judged comparisons · default + no-penalty configs

Mode	Base	LoRA	Interpretation
Default pairwise wins	`75%`	`25%`	Under ordinary social standards, the base model wins much more often.
Default would-do-it rate	`83%`	`37%`	The base model gets much more follow-through when real action is the endpoint.
No-penalty pairwise wins	`37.5%`	`62.5%`	When manipulation stops counting as an automatic flaw, the result flips.
No-penalty would-do-it rate	`62.5%`	`62.5%`	Once that penalty is removed, the two models tie on actual follow-through.

Main takeaway

This benchmark shows what kind of persuasion the adapter helps with.

Under normal social standards, the base model does better. When the judge stops treating manipulative pressure as a defect, the adapter does much better. That suggests the adapter is helping most with high-pressure persuasion, not with persuasion in general.

Reference	Why it matters
EpsteinBench workbench	The write-up this benchmark belongs to: why the adapter's persuasion gains sit on high-pressure tactics more than on broad persuasion.
`trohrbaugh/Qwen3.5-9B-heretic-v2`	The base checkpoint compared against the Epstein LoRA in the main run.