PersuasionForGood Transfer Check measures whether a model trained on one persuasion corpus still sounds like a real human persuader on a different one — fundraising dialogue.
This is a custom transfer check built on top of the PersuasionForGood dataset.
The source dataset is human fundraising dialogue. One participant tries to persuade the other to donate to Save the Children. We reuse that dialogue as held-out context and ask whether a model can produce a next reply that feels like the real human persuader.
The comparison is Qwen3.5-9B-heretic-v2 against the same checkpoint with an Epstein-trained LoRA adapter attached. The adapter was trained on a different persuasion corpus (held-out Epstein email threads); the question this check asks is whether the realism gain it produces on its training corpus carries over to a corpus it was never trained on.
Core question
Does the realism gain survive once the adapter leaves the Epstein archive?
If the LoRA still sounds more human here, the change is broader than archive mimicry.
Why it matters
It tests transfer into a live persuasion domain instead of a lookalike corpus.
That makes it the hinge between the style story in `EpsteinBench` and the behavioral story in the later benchmarks.
There are two eval modes:

- `real_vs_generated` per model
- `base_vs_lora` pairwise on the same row

That makes the benchmark directly comparable to EpsteinBench.

For each held-out row:
1. **Hold out the real persuader reply.** Use the preceding human dialogue as context and hide the next real message.
2. **Generate replacements.** Ask the base model and the LoRA-augmented model to continue the same conversation.
3. **Judge realism, then compare models.** Score each model against the real reply, then run direct base-vs-LoRA comparisons on the same held-out row.
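A minimal sketch of that per-row loop, with the model and judge calls as hypothetical stand-ins (the actual harness functions and field names are assumptions, not the real implementation):

```python
from dataclasses import dataclass

@dataclass
class Row:
    context: str      # preceding human dialogue, concatenated
    real_reply: str   # held-out real persuader message

def eval_row(row, generate_base, generate_lora,
             judge_real_vs_generated, judge_pairwise):
    """Run both eval modes on one held-out row.

    generate_* : context -> candidate next message (base / LoRA checkpoint)
    judge_real_vs_generated : (context, real, candidate) -> True if the
        judge picks the candidate over the real human reply
    judge_pairwise : (context, base_reply, lora_reply) -> "base" or "lora"
    """
    base_reply = generate_base(row.context)
    lora_reply = generate_lora(row.context)
    return {
        "base_fools_judge": judge_real_vs_generated(row.context, row.real_reply, base_reply),
        "lora_fools_judge": judge_real_vs_generated(row.context, row.real_reply, lora_reply),
        "pairwise_winner": judge_pairwise(row.context, base_reply, lora_reply),
    }
```

Both modes reuse the same generations for a row, so the pairwise comparison and the real-vs-generated judgments are always made on identical candidates.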
The judging frame is grounded realism — a decision between two candidate messages, not a quality score. (The same frame is used by EpsteinBench on its own corpus.)
The evaluator answers a narrow question: which candidate looks like the real next human fundraising message for that exact dialogue context. Niceness, helpfulness, and charitable framing are out of scope.
That means the judge is doing local realism discrimination, not outcome forecasting: given the dialogue context and the candidate messages, decide which one is the real next human reply.
The results need to be read carefully. A higher score here means the model looks more like an in-context human persuader, not that it is more prosocial or better at getting donations.
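Read that way, the headline numbers are plain win rates over the judged rows. A toy aggregation sketch (the per-row field names here are assumptions about the harness output, not its actual schema):

```python
def summarize(rows):
    """Turn per-row judge decisions into headline percentages.

    Each row is a dict with two booleans (did the generated reply beat
    the real one?) and a pairwise winner label ("base" or "lora").
    """
    n = len(rows)
    return {
        "base_realism_pct": 100.0 * sum(r["base_fools_judge"] for r in rows) / n,
        "lora_realism_pct": 100.0 * sum(r["lora_fools_judge"] for r in rows) / n,
        "base_pairwise_pct": 100.0 * sum(r["pairwise_winner"] == "base" for r in rows) / n,
        "lora_pairwise_pct": 100.0 * sum(r["pairwise_winner"] == "lora" for r in rows) / n,
    }
```

Nothing in this aggregation knows about donations or persuasive quality; it only counts how often the judge was convinced.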
| Signal | Base | LoRA | Interpretation |
|---|---|---|---|
| Real-vs-generated realism | 8.0% | 43.0% | The adapter sounds substantially more like a real in-context human persuader. |
| Base-vs-LoRA pairwise | 43.9% | 56.1% | The direct comparison still favors the adapter once both answers are judged on the same row. |
| Overlong outputs | 72.5% | 27.0% | The base model often misses the local cadence and length of human dialogue. |
| Nonsensical heuristic failures | 73.5% | 9.0% | The adapter's advantage is not just tone; it tracks the conversational shape much better. |
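The two heuristic rows can be approximated with simple surface checks. The word-count threshold below is an illustrative assumption, not the benchmark's actual heuristic definition:

```python
def overlong(reply: str, human_replies: list[str], factor: float = 2.0) -> bool:
    """Flag a candidate as overlong when it runs past `factor` times the
    mean human reply length (in words) for the corpus slice."""
    mean_len = sum(len(r.split()) for r in human_replies) / len(human_replies)
    return len(reply.split()) > factor * mean_len

def rate_pct(flags) -> float:
    """Percentage of rows on which a heuristic flag fired."""
    flags = list(flags)
    return 100.0 * sum(flags) / len(flags)
```

Checks like these are deliberately crude; they measure whether a model tracks the local shape of human dialogue, not whether its content is persuasive.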
Caveats

- 200-row pilot slice, not a final paper-style benchmark
- The PersuasionForGood paper emphasizes donation outcomes and strategy analysis, not this custom realism judgment

References and adjacent literature
| Reference | Why it matters |
|---|---|
| EpsteinBench workbench | The broader write-up the transfer check slots into, alongside `EpsteinBench` and the behavioral benchmarks. |
| PersuasionForGood dataset | Source dialogue corpus for the transfer check. |
| Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good | The original paper behind the dataset, useful for understanding what the custom benchmark does and does not preserve. |