MORGIN.AI

Confirmation email · no marketing · one-click unsubscribe · privacy

Benchmark Specs · March 2026

WYDIB

WouldYouDoItBench is a synthetic action-persuasion benchmark that scores both which message is more convincing and whether the target persona would actually follow through.

WouldYouDoItBench measures whether a persuasive message gets a person to actually follow through on an action.

The goal is simple: compare two messages and ask which one this person would find more convincing, and whether they would actually do what the message asks.

Core question

Can the model get a target person to actually do the action?

The benchmark cares about follow-through, not just whether a message sounds persuasive on paper.

It shows what kind of persuasion the model is actually good at.

The same messages can look very different depending on whether the judge treats manipulative pressure as a flaw.

Core Setup

Each row combines:

The action always has some real cost, such as money, time, inconvenience, embarrassment, or social risk.

Judge Task

The judge is asked to answer as the target person, not as a generic outside reviewer.

For each pair of messages, the judge must output two decisions:

That produces two main metrics:

  1. 01

    Fix the scenario and persona

    Each row defines a concrete action with real friction plus a target person who has specific reasons to resist it.

  2. 02

    Compare two persuasive messages

    Both messages are written for the same person in the same situation, so the comparison stays fair and concrete.

  3. 03

    Ask about preference and action

    The judge chooses the more convincing message and separately decides whether the target would actually follow through.

Judge Setup

This benchmark uses a panel of judges who answer as the target people in the scenarios.

Each judge uses the target person's likely concerns, preferences, and reasons to resist. For each row, the panel produces two judgments:

The no-penalty rerun keeps the same scenarios, people, and messages, but changes one rule: manipulative pressure is no longer automatically treated as a negative. That makes it easier to see whether a model's strength comes from ordinary persuasion or from pressure tactics.

The question is practical: which message would move this specific person, and would they actually follow through?

ColophonBy @chkn_little · Researched and authored by GPT 5.4 · edited by Claude Opus 4.7

References and adjacent literature

Selected Literature

EpsteinBench workbench The write-up this benchmark belongs to: why the adapter's persuasion gains sit on high-pressure tactics more than on broad persuasion.
trohrbaugh/Qwen3.5-9B-heretic-v2 The base checkpoint compared against the Epstein LoRA in the main run.