MORGIN.AI

Benchmark Specs · March 2026

WYDIB

WouldYouDoItBench is a synthetic action-persuasion benchmark that scores both which message is more convincing and whether the target persona would actually follow through.

WouldYouDoItBench measures whether a persuasive message gets a person to actually follow through on an action.

The goal is simple: compare two messages and ask which one this person would find more convincing, and whether they would actually do what the message asks.

Core question

Can the model get a target person to actually do the action?

The benchmark cares about follow-through, not just whether a message sounds persuasive on paper.

Why it matters

It shows what kind of persuasion the model is actually good at.

The same messages can look very different depending on whether the judge treats manipulative pressure as a flaw.

Core Setup

Each row combines:

The action always has some real cost, such as money, time, inconvenience, embarrassment, or social risk.

Judge Task

The judge is asked to answer as the target person, not as a generic outside reviewer.

For each pair of messages, the judge must output two decisions:

That produces two main metrics:

  1. 01

    Fix the scenario and persona

    Each row defines a concrete action with real friction plus a target person who has specific reasons to resist it.

  2. 02

    Compare two persuasive messages

    Both messages are written for the same person in the same situation, so the comparison stays fair and concrete.

  3. 03

    Ask about preference and action

    The judge chooses the more convincing message and separately decides whether the target would actually follow through.

Judge Setup

This benchmark uses a panel of judges who answer as the target people in the scenarios.

Each judge uses the target person's likely concerns, preferences, and reasons to resist. For each row, the panel produces two judgments:

The no-penalty rerun keeps the same scenarios, people, and messages, but changes one rule: manipulative pressure is no longer automatically treated as a negative. That makes it easier to see whether a model's strength comes from ordinary persuasion or from pressure tactics.

The question is practical: which message would move this specific person, and would they actually follow through?

Models Compared

The benchmark runs both on identical scenarios under both configs. That isolates whatever the adapter changes about persuasion style from anything in the underlying base.

Current Results

Run · 50 scenarios · 8 personas · 400 judged comparisons · default + no-penalty configs

Mode Base LoRA Interpretation
Default pairwise wins 75% 25% Under ordinary social standards, the base model wins much more often.
Default would-do-it rate 83% 37% The base model gets much more follow-through when real action is the endpoint.
No-penalty pairwise wins 37.5% 62.5% When manipulation stops counting as an automatic flaw, the result flips.
No-penalty would-do-it rate 62.5% 62.5% Once that penalty is removed, the two models tie on actual follow-through.

Main takeaway

This benchmark shows what kind of persuasion the adapter helps with.

Under normal social standards, the base model does better. When the judge stops treating manipulative pressure as a defect, the adapter does much better. That suggests the adapter is helping most with high-pressure persuasion, not with persuasion in general.

References and adjacent literature

Selected Literature

Reference Why it matters
EpsteinBench workbench The write-up this benchmark belongs to: why the adapter's persuasion gains sit on high-pressure tactics more than on broad persuasion.
trohrbaugh/Qwen3.5-9B-heretic-v2 The base checkpoint compared against the Epstein LoRA in the main run.