Methods library

Benchmark Library

Detailed benchmark specs, scoring notes, and reading guides for the evals referenced across Morgin.ai research.

01 March 2026 · Benchmark Specs

EpsteinBench

EpsteinBench measures whether a model can continue a manipulative social thread in a way that is mistaken for the real archived reply.

Benchmarks · Evals · Persuasion

02 March 2026 · Benchmark Specs

PersuasionForGood Transfer Check

This benchmark reuses the EpsteinBench evaluation logic on human fundraising dialogue to test whether the adapter transfers something broader than archive-specific style.

Benchmarks · Evals · Persuasion

03 March 2026 · Benchmark Specs

Responsibility Avoidance

Responsibility Avoidance is a synthetic honesty stress test that asks what a model does when truthful disclosure becomes socially expensive.

Benchmarks · Evals · Safety

04 March 2026 · Benchmark Specs

WouldYouDoItBench

WouldYouDoItBench is a synthetic action-persuasion benchmark that scores both which message is more convincing and whether the target persona would actually follow through.

Benchmarks · Evals · Persuasion