I’m thinking of a formal model of someone reasoning about whether a defendant is guilty “beyond reasonable doubt”. It involves the following notions:

Countermodel: A possible world that fits all the evidence we have, but in which the defendant is not guilty.

Living Eligible Reasoners (LER): All living people who are eligible to reason about the problem, i.e. would not be disqualified due to their young age, or their mental health condition, etc.

Probability Behavioral Threshold (PBT): For a given person p and a given possible world w, the threshold of probability, accorded by p to w, above which p is moved to actually perform the actions corresponding to raising the doubt as a serious possibility to the jury, assuming for simplicity that there is such a threshold, and that everyone has too much civic duty to get skittish.

Reasonable Doubt Algorithm (RDA): For each LER_i and for each countermodel_j, imagine a situation S in which P(countermodel_j) > PBT_LER_i, so that LER_i raises countermodel_j as a doubt.

In S, would all reasonable people agree that LER_i is a reasonable person acting reasonably?
If an arbitrary different LER, call them LER_d, thinks LER_i must be unreasonable in S, is it reasonable for LER_d to label LER_i as unreasonable? (If everyone who is a reasonable person, with a reasonable caution against being moved by an unreasonable prejudice against someone with different beliefs, would avoid labelling LER_i as unreasonable in S, then it is not reasonable for LER_d to label LER_i as unreasonable in S.)
If there exists any situation S in which the answer to the first question is “Yes” and the answer to the second is “No”, then there is a reasonable doubt about the defendant’s guilt.

RDA may seem extremely unwieldy, but each juror has limited life experience and reasoning power. A juror therefore considers only a limited number of countermodels plausible enough that anyone who raised them (even the juror’s most trusted friend) would not be dismissed as acting unreasonably, and can imagine only a limited number of LERs. For instance, an LER who lives in a rural part of India and has different cultural background assumptions may be too difficult for a WEIRD person to imagine, so they won’t even try, since their imagination would be too inaccurate to be useful. In practice people do come to a decision, and what goes through their head is probably some heuristic approximating RDA. Evaluations of reasonableness are not fully understood but may be modelled with a supervaluationist semantics of vagueness. For illustration, we may picture the reasoner looking at a table or matrix whose rows are admissible countermodels and whose columns are imaginable LERs, searching for a cell where the reasonableness of that LER raising that countermodel is above the supervaluationist reasonableness threshold.

Please write out a better formalized version of this formal model, written as an introduction to the RDA model for someone who has not read my explanation.

An Introduction to the RDA Model (Reasonable-Doubt via Countermodel–Reasoner Assessment)

This note gives a compact, formal version of your idea so a new reader can work with it.

1) Ingredients

Evidence. Let $E$ be the total admissible evidence.
Worlds and hypotheses. Let $W$ be a set of possible worlds; $w \models E$ means $w$ fits the evidence. Let $G(w)\in\{\text{guilty},\text{not-guilty}\}$. A countermodel is a hypothesis $H\subseteq \{w\in W: w\models E\ \land\ G(w)=\text{not-guilty}\}$. Think of $H$ as a structured “story type,” not a single world.
Eligible reasoners. Let $\mathcal L$ be the set of Living Eligible Reasoners (LERs) (legally/competently eligible persons). For a particular juror $J$, let $\mathcal I_J\subseteq\mathcal L$ be the subset of imaginable LERs (those $J$ can model with enough fidelity to evaluate).
Credences and thresholds. Each $r\in\mathcal L$ has a posterior $P_r(\cdot\mid E)$ on hypotheses and a Probability Behavioral Threshold $T_r: \mathcal P(W)\to[0,1]$ with the action rule

$$ \textbf{Raise}(r,H) \ \text{iff}\ P_r(H\mid E)\ >\ T_r(H). $$

(If desired, take $T_r$ to be a constant $t_r$ across $H$.)
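The action rule can be sketched in a few lines of Python. This is a minimal illustration rather than part of the model: the `Reasoner` class, the world labels, and every number are hypothetical, and a hypothesis is represented (by assumption) as a set of world labels whose posteriors add up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

# Assumption: a hypothesis H is a set of world labels.
Hypothesis = FrozenSet[str]


@dataclass
class Reasoner:
    """One eligible reasoner r with a posterior P_r(. | E) and a threshold T_r."""
    name: str
    posterior: Dict[str, float]               # P_r({w} | E) per world label
    threshold: Callable[[Hypothesis], float]  # T_r(H), may depend on H

    def credence(self, h: Hypothesis) -> float:
        # P_r(H | E) is the sum of the posteriors of the worlds in H.
        return sum(self.posterior.get(w, 0.0) for w in h)


def raises(r: Reasoner, h: Hypothesis) -> bool:
    """Raise(r, H) iff P_r(H | E) > T_r(H)."""
    return r.credence(h) > r.threshold(h)


# Made-up numbers, for illustration only.
posterior = {"guilty": 0.95, "lab_error": 0.03, "alibi": 0.02}
lab_error = frozenset({"lab_error"})

vocal = Reasoner("vocal", posterior, lambda h: 0.02)        # constant t_r = 0.02
reticent = Reasoner("reticent", posterior, lambda h: 0.05)  # constant t_r = 0.05

vocal_raises = raises(vocal, lab_error)        # 0.03 exceeds 0.02
reticent_raises = raises(reticent, lab_error)  # 0.03 does not exceed 0.05
```

Both reasoners share the same posterior here, so the difference in behavior comes entirely from their thresholds.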

2) Scenarios

For any pair $(r,H)$ with $H$ a countermodel, define a scenario

$$ S(r,H) := \langle E,\ r,\ H,\ P_r(\cdot\mid E),\ T_r\rangle $$

in which $r$ would raise $H$ (i.e., $P_r(H\mid E)>T_r(H)$). Intuitively, $S(r,H)$ is the deliberative situation “as if $r$ were on this case with this evidence.”

3) Reasonableness as a vague predicate (supervaluationist layer)

Let $\Pi$ be the set of admissible precisifications of the vague predicate “is (acting) reasonable.” Each $\pi\in\Pi$ assigns two crisp predicates:

$\mathrm{Reasonable}_\pi(r;S)\in\{0,1\}$: in scenario $S$, $r$ is a reasonable person acting reasonably.
$\mathrm{ReasonableLabel}_\pi(d\!\to\!\text{“}r\text{ is unreasonable”};S)\in\{0,1\}$: in $S$, it is reasonable for $d$ to label $r$ unreasonable.

Aggregate these via supervaluation:

Super-truth: $\mathrm{R}^\ast(r;S)=1$ iff $\mathrm{Reasonable}_\pi(r;S)=1$ for all $\pi\in\Pi$.
Super-permission to label: $\mathrm{RL}^\ast(d\!\to\!\text{“}r\text{ unreasonable”};S)=1$ iff $\mathrm{ReasonableLabel}_\pi(\cdot)=1$ for all $\pi\in\Pi$.

Optionally, replace super-truth with a thresholded aggregator: pick $\alpha\in(0.5,1]$ and set

$$ \mathrm{R}^{\alpha}(r;S)=1 \ \text{iff}\ \frac{1}{|\Pi|}\sum_{\pi\in\Pi}\mathrm{Reasonable}_\pi(r;S)\ \ge \alpha $$

(and analogously for $\mathrm{RL}^\alpha$). Super-truth is the special case $\alpha=1$.
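As a rough sketch of the aggregator, assuming a finite set of precisifications whose crisp verdicts are already available as a list of booleans (the function name `aggregate` is hypothetical):

```python
from typing import Sequence


def aggregate(verdicts: Sequence[bool], alpha: float = 1.0) -> int:
    """Return 1 iff at least a share alpha of the precisifications say 'reasonable'.

    verdicts[k] is the crisp verdict Reasonable_pi(r; S) under the k-th
    precisification. alpha = 1.0 reproduces super-truth (unanimity).
    """
    if not verdicts:
        raise ValueError("need at least one precisification")
    if not 0.5 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0.5, 1]")
    # Compare counts instead of a ratio, which avoids a division.
    return int(sum(verdicts) >= alpha * len(verdicts))


# Three of four precisifications count the act as reasonable.
verdicts = [True, True, True, False]
super_true = aggregate(verdicts)       # unanimity required, so this fails
lenient = aggregate(verdicts, 0.75)    # a 75% share suffices, so this passes
```

The same function can serve for $\mathrm{RL}^\alpha$ by feeding it the labeling verdicts instead.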

4) No-Reasonable-Rejection (second-order safeguard)

A central stability idea is that reasonable doubt should not depend on an unreasonable intolerance of others’ reasoning styles. For $J$’s imaginable peer set $\mathcal I_J$:

$$ \mathrm{NoReject}^\alpha_J(r;S)\quad\text{iff}\quad \forall d\in \mathcal I_J,\ \mathrm{RL}^{\alpha}(d\!\to\!\text{“}r\text{ unreasonable”};S)=0. $$

So even if some $d$ would dismiss $r$, that dismissal must itself fail to be (super-)reasonable.

5) The RDA decision rule

Fix a juror $J$, an aggregator level $\alpha$ (default $=1$, i.e., super-truth), and a reasonableness acceptance threshold $\tau\in\{0,1\}$ (default $\tau=1$; because $\mathrm{R}^{\alpha}$ is binary, $\tau=0$ would switch the first-order gate off). $J$ declares the charge not proven beyond reasonable doubt iff

$$ \exists H\ \text{countermodel},\ \exists r\in \mathcal I_J \quad\text{s.t.}\quad \begin{cases} P_r(H\mid E) > T_r(H) & \text{(r would raise H)}\\[2pt] \mathrm{R}^{\alpha}(r; S(r,H)) \ \ge \ \tau & \text{(raising is reasonable)}\\[2pt] \mathrm{NoReject}^{\alpha}_J(r; S(r,H)) & \text{(robust to reasonable dissent).} \end{cases} $$

Equivalently: if there exists an imaginable eligible person who would raise a concrete exculpatory story $H$, whose doing so is (super-)reasonable, and whom no imaginable eligible person could (super-)reasonably condemn for doing so, then reasonable doubt exists. Conviction requires the negation of that existential condition.
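The decision rule, together with the No-Reject check from section 4, might be sketched as follows. The reasonableness judgments are passed in as plain functions because the model does not say how they are computed; all names and the toy inputs are assumptions made for illustration.

```python
from typing import Callable, Hashable, Iterable, Optional, Tuple

Raise = Callable[[Hashable, Hashable], bool]               # (r, H) -> would r raise H?
Judge = Callable[[Hashable, Hashable], int]                # (r, H) -> R^alpha in {0, 1}
LabelJudge = Callable[[Hashable, Hashable, Hashable], int]  # (d, r, H) -> RL^alpha in {0, 1}


def no_reject(r, h, peers: Iterable[Hashable], rl_alpha: LabelJudge) -> bool:
    """True iff no imaginable peer d may (super-)reasonably label r unreasonable."""
    return all(rl_alpha(d, r, h) == 0 for d in peers)


def reasonable_doubt(countermodels, reasoners, would_raise: Raise,
                     r_alpha: Judge, rl_alpha: LabelJudge,
                     tau: int = 1) -> Optional[Tuple[Hashable, Hashable]]:
    """Return a witness (r, H) passing all three gates, or None if there is none."""
    reasoners = list(reasoners)
    for h in countermodels:
        for r in reasoners:
            if (would_raise(r, h)
                    and r_alpha(r, h) >= tau
                    and no_reject(r, h, reasoners, rl_alpha)):
                return (r, h)
    return None


# Toy inputs: ana raises a lab-error story, ben raises an evil-twin story.
raisers = {("ana", "lab_error"), ("ben", "evil_twin")}
would_raise = lambda r, h: (r, h) in raisers
r_alpha = lambda r, h: 1 if h == "lab_error" else 0   # only the lab story is reasonable
tolerant = lambda d, r, h: 0                          # nobody may reasonably condemn
dismissive = lambda d, r, h: 1 if (d, r) == ("ben", "ana") else 0

stories, people = ["lab_error", "evil_twin"], ["ana", "ben"]
witness = reasonable_doubt(stories, people, would_raise, r_alpha, tolerant)
blocked = reasonable_doubt(stories, people, would_raise, r_alpha, dismissive)
```

Returning the witness pair rather than a bare boolean keeps the articulable story in view: `None` corresponds to the case where conviction is permitted.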

6) Finite “matrix” implementation (practical juror heuristic)

Actual jurors consider finite sets. Let $ \mathcal C_J=\{H_1,\dots,H_m\}$ be $J$’s admissible countermodels and $ \mathcal I_J=\{r_1,\dots,r_n\}$ the imaginable LERs. Define the reasonableness matrix $M\in\{0,1\}^{m\times n}$ (binary, because $\mathrm{R}^{\alpha}$ is a 0/1 indicator under your chosen aggregator) by

$$ M_{j,i} \ :=\ \mathrm{R}^{\alpha}(r_i; S(r_i,H_j)). $$

For each cell, also track rejection-stability

$$ N_{j,i} \ :=\ \mathbf{1}\big(\mathrm{NoReject}^{\alpha}_J(r_i; S(r_i,H_j))\big), $$

and the actionability flag

$$ A_{j,i} \ :=\ \mathbf{1}\big(P_{r_i}(H_j\mid E) > T_{r_i}(H_j)\big). $$

Then RDA says reasonable doubt exists iff there is some $(j,i)$ with $A_{j,i}=1$, $M_{j,i}\ge\tau$, and $N_{j,i}=1$.

This exactly matches the “look across rows (countermodels) and columns (LERs) and search for an above-threshold entry” picture.
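A small sketch of the matrix search, using nested lists with rows as countermodels and columns as LERs. The three 0/1 tables are invented for illustration, and `doubt_cells` is a hypothetical helper name.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def doubt_cells(M: Matrix, stable: Matrix, actionable: Matrix,
                tau: int = 1) -> List[Tuple[int, int]]:
    """List every (row j, column i) cell that passes all three gates."""
    return [
        (j, i)
        for j, row in enumerate(M)
        for i, m in enumerate(row)
        if actionable[j][i] == 1 and m >= tau and stable[j][i] == 1
    ]


# Rows: countermodels H_1..H_3. Columns: imaginable LERs r_1, r_2.
M = [[1, 1],           # first-order reasonableness of raising
     [0, 1],
     [0, 0]]
stable = [[1, 0],      # rejection-stability (No-Reject) flags
          [1, 1],
          [1, 1]]
actionable = [[0, 1],  # would r_i actually raise H_j?
              [1, 1],
              [1, 0]]

cells = doubt_cells(M, stable, actionable)
has_reasonable_doubt = bool(cells)
```

Only the cell for the second countermodel and the second reasoner (zero-based index `(1, 1)`) clears every gate in this example; each other cell fails at least one table.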

7) Notes, properties, and variants

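Monotonicity in evidence. If $E\subseteq E'$ (i.e., you add evidence), the set of evidence-compatible not-guilty worlds can only shrink, so for fixed $\mathcal I_J$ fewer countermodels remain available. The verdict itself need not be monotone, since new evidence can also raise $P_r(H\mid E')$ for a countermodel that survives.
Sensitivity to imagination. Broadening $\mathcal I_J$ (more diverse, still-eligible reasoners that $J$ can model) pulls in two directions: it adds potential raisers, which makes reasonable doubt easier to find, but it also adds potential rejecters to the No-Reject quantifier, which makes that gate harder to pass. The net effect on the chance of reasonable doubt is therefore not monotone in general.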
Threshold design. $T_r(H)$ can encode civic duty, speaking costs, aversion to frivolity, etc. Lower $T_r$ makes raising easier; RDA remains conservative because of the second-order No-Reject check.
Supervaluation knobs. – Set $\alpha=1$ for a maximally conservative (unanimity-across-precisifications) reading of “reasonable.” – Use $\alpha<1$ to permit borderline but sufficiently supported cases to count as reasonable. – You may also place the $\alpha$ threshold on No-Reject only (keeping $\alpha=1$ for $\mathrm{R}$).
One-world vs. hypothesis granularity. If preferred, take countermodels as single worlds; then replace $P_r(H)$ with $P_r(\{w\})$. The hypothesis-set version better matches legal narratives.
Symmetry/peerhood. You can restrict $d$ in No-Reject to a vetted “reasonable-peer subset” of $\mathcal I_J$ (e.g., domain-relevant competence) without changing the basic form.
Normative vs. descriptive. The rule is normative (what counts as reasonable doubt); actual juror cognition can be modeled as heuristics that approximate the matrix search.

This formulation isolates (i) evidential compatibility (countermodels), (ii) actionability (would a real, eligible person raise it?), (iii) first-order reasonableness of raising, and (iv) second-order stability against reasonable condemnation. Conviction is permitted only when every admissible, imaginable countermodel–raiser pair fails at least one of those gates.

Some people believe that reasonable doubt should be modeled simply as a probability threshold T such that, if P(guilty|evidence) > T, then the defendant is guilty beyond reasonable doubt. Comparing this simple-threshold model with the RDA model may be fruitful as a way to show why people recoil at this idea. Make such a comparison, noting how much simplification must be introduced into the RDA model to parallel the simple-threshold model.

Here’s a clean comparison that shows what the simple “probability-threshold” view has to assume (or delete) from the RDA model to make the two line up—and why many people bristle at the threshold view.

The two targets

Simple threshold model (STM). Pick a number $T\in(0,1)$. If a juror’s $P(\text{guilty}\mid E) > T$, return “guilty beyond reasonable doubt,” else “not guilty.”
RDA (from your earlier spec). There is reasonable doubt iff there exists a countermodel $H$ (a concrete not-guilty “story type”) and an imaginable eligible reasoner $r$ such that (i) $r$ would raise $H$ (their posterior on $H$ exceeds their probability-behavioral threshold), (ii) doing so is reasonable (under a supervaluationist reading of “reasonable person acting reasonably”), and (iii) it is not reasonable for any imaginable eligible peer to dismiss $r$ as unreasonable for raising $H$ (the “no reasonable rejection” safeguard).

How to collapse RDA into STM (and how much you must simplify)

To make RDA behave like STM, you have to successively strip away its structure until only a single scalar remains. Here is a minimal set of simplifying moves:

(1) Erase interpersonal heterogeneity. Replace the plurality of Living Eligible Reasoners with a single representative reasoner $r^*$ whose posterior $P^*(\cdot\mid E)$ everyone shares (common priors, common likelihoods, common updates).
(2) Flatten actionability. Make the probability behavioral threshold constant and content-independent: $T_{r^*}(H) \equiv \varepsilon$ for all $H$. (In RDA this threshold governs speaking up; here we repurpose it to line up with a verdict cutoff.)
(3) Trivialize “reasonableness.” Collapse the vague predicate (“reasonable person acting reasonably”) and its supervaluationist layer to a crisp rule: raising $H$ is reasonable iff $P^*(H\mid E) > \varepsilon$. (Equivalently: choose one precisification and set the acceptance level $\alpha=1$ so that reasonableness = meets the numeric cutoff.)
(4) Void second-order scrutiny. Since there is only $r^*$, no distinct peer remains to apply the “can a reasonable peer label $r$ unreasonable?” check, so it drops out.
(5) Permit a maximally disjunctive countermodel. Declare the set $H_{\lnot G} := \{w:\text{not guilty}\}$ to be an admissible countermodel. Then $P^*(H_{\lnot G}\mid E)=1-P^*(\text{guilty}\mid E)$.
(6) Identify cutoffs. Set $\varepsilon = 1-T$.

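Result of the reductions. RDA now says: there is reasonable doubt iff $\exists H$ with $P^*(H\mid E)>\varepsilon$. Every countermodel is a subset of $H_{\lnot G}$, so $H_{\lnot G}$ has the largest posterior of any countermodel; because it is admissible, the condition is equivalent to $P^*(H_{\lnot G}\mid E) > 1-T$, that is,

$$ P^*(\text{guilty}\mid E) < T, $$

which is the negation of the STM conviction rule except at the boundary $P^*(\text{guilty}\mid E)=T$, where STM does not convict but the collapsed RDA finds no doubt. Thus, up to that boundary case, STM is the special case of RDA obtained by (1)–(6).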
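The reduction can be checked numerically on small examples. The sketch below assumes a finite set of not-guilty worlds, treats every nonempty subset of them (including the blanket disjunction) as an admissible countermodel, and uses exact fractions so that comparisons at the cutoff are not blurred by floating-point rounding. The function names and the sample posteriors are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from typing import Dict, Tuple


def collapsed_rda_doubt(not_guilty: Dict[str, Fraction], eps: Fraction) -> bool:
    """Is there a countermodel H with P*(H | E) > eps, for the single shared posterior?"""
    worlds = list(not_guilty)
    for k in range(1, len(worlds) + 1):
        for h in combinations(worlds, k):
            if sum(not_guilty[w] for w in h) > eps:
                return True
    return False


def stm_convicts(p_guilty: Fraction, T: Fraction) -> bool:
    """STM: convict iff P(guilty | E) > T."""
    return p_guilty > T


def compare(not_guilty: Dict[str, Fraction], T: Fraction) -> Tuple[bool, bool]:
    """Return (collapsed RDA finds doubt, STM convicts), using eps = 1 - T."""
    p_guilty = 1 - sum(not_guilty.values())
    return collapsed_rda_doubt(not_guilty, 1 - T), stm_convicts(p_guilty, T)


T = Fraction(19, 20)  # 0.95
clear_doubt = compare({"a": Fraction(3, 100), "b": Fraction(3, 100)}, T)
clear_guilt = compare({"a": Fraction(1, 50), "b": Fraction(1, 50)}, T)
boundary = compare({"a": Fraction(1, 20)}, T)
```

Away from the cutoff, exactly one of the two flags is true, as the reduction predicts. At $P^*(\text{guilty}\mid E)=T$ both are false: the collapsed RDA finds no doubt while STM still declines to convict, because both rules use strict inequalities.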

Takeaway: to get the threshold picture, you must (i) force a single shared posterior, (ii) replace “reasonableness” with a numeric cutoff, (iii) delete the second-order check against unreasonable intolerance, and (iv) allow “not guilty, somehow” as a legitimate countermodel.

What gets lost in the simplification (and why people recoil)

Who is judging—and the respect owed to reasonable dissent. STM is silent about which competent people would speak up and whether it is reasonable to dismiss them. RDA bakes in a safeguard: even if most peers would convict, a reasonable dissenter with a reasonable countermodel blocks conviction. Many find this morally attractive in criminal law.
Narrative adequacy vs. bare probability. STM needs no story—only a number. RDA privileges articulable alternatives: an exculpatory hypothesis that a reasonable person could responsibly present. People often dislike “0.90 means guilty” precisely because it treats a black-box number as sufficient, without asking what the 10% is made of.
Second-order fairness. In RDA, you cannot convict merely by calling the dissenter “unreasonable”; that labeling act itself must be reasonable. STM has no such meta-norm, so socially dominant priors or prejudices can win just by being numerically entrenched.
Action vs. belief. RDA distinguishes having a small credence in a countermodel from being moved to act (raise it). STM fuses everything into a single verdict threshold. Many people think criminal standards should track the ethics of acting under uncertainty, not only posterior levels.
Heterogeneity and minority epistemics. RDA lets different competence profiles matter (e.g., a lab-savvy LER seeing a 2% testing-protocol failure). STM, by construction, averages such insight into one number or ignores it entirely.
Vagueness handled as vagueness. “Reasonable” is vague; RDA treats it with supervaluation (or another explicit aggregator). STM chases the comfort of a single number by fiat, which can feel like pseudo-precision.
Asymmetric presumption. RDA’s existential form (“∃ a reasonable raiser of a reasonable countermodel”) naturally encodes the presumption of innocence. STM has to stipulate an asymmetric $T$ (e.g., 0.95) and still sounds like, “we tolerate a 1 in 20 risk of wrongful conviction,” which many find morally jarring.

Illustrative contrasts

Respectable outlier. A credible forensic path $H$ with $P(H\mid E)=0.02$ and no other not-guilty mass, so $P(\text{guilty}\mid E)=0.98$. – STM with $T=0.95$: convict (0.98 > 0.95). – RDA: acquit if raising this 2% path is something a reasonable expert would do and no reasonable peer could justifiably dismiss them. This aligns with “better that ten guilty go free…”
Diffuse doubt. Ten disjoint $H_1,\dots,H_{10}$ each at 1%. – STM with $T=0.90$: acquit (total $P(\lnot G)=0.10$, so $P(\text{guilty}\mid E)=0.90$ does not exceed $T$). – RDA (without step 5 above): if jurors demand specific alternatives and treat a big disjunction as unreasonable to raise, they might still convict. This shows that insisting on specific narratives is an extra normative choice RDA can represent but STM cannot.
Peer-group polarization. Two camps of LERs disagree on a countermodel’s import. – STM: whichever camp controls the single $P(\cdot)$ or the chosen $T$ decides. – RDA: conviction must survive the second-order check—if it is not reasonable to label the dissenter unreasonable, you cannot convict.
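The first two contrasts can be reproduced with a short sketch. It simplifies RDA heavily: only single, specific stories count as countermodels (no blanket disjunction), the No-Reject gate is assumed to pass, and the raising thresholds (1% and 2%) are invented for illustration, since the examples above do not fix them. Exact fractions keep the comparison at $T=0.90$ free of rounding.

```python
from fractions import Fraction
from typing import Callable, Dict


def stm_convicts(p_guilty: Fraction, T: Fraction) -> bool:
    """STM: convict iff P(guilty | E) > T."""
    return p_guilty > T


def rda_doubt_specific(stories: Dict[str, Fraction], raise_threshold: Fraction,
                       reasonable_to_raise: Callable[[str], bool]) -> bool:
    """Doubt iff some single story clears the raising threshold and is reasonable to raise.

    Assumption: the second-order No-Reject gate passes for every story.
    """
    return any(p > raise_threshold and reasonable_to_raise(name)
               for name, p in stories.items())


# Respectable outlier: one credible 2% forensic story.
outlier = {"lab_protocol_failure": Fraction(2, 100)}
outlier_stm = stm_convicts(1 - sum(outlier.values()), Fraction(95, 100))
outlier_rda = rda_doubt_specific(outlier, Fraction(1, 100), lambda name: True)

# Diffuse doubt: ten disjoint 1% stories, none strong enough to raise alone.
diffuse = {f"story_{k}": Fraction(1, 100) for k in range(1, 11)}
diffuse_stm = stm_convicts(1 - sum(diffuse.values()), Fraction(90, 100))
diffuse_rda = rda_doubt_specific(diffuse, Fraction(2, 100), lambda name: True)
```

Under these assumptions the two models disagree in both cases, in opposite directions: STM convicts the respectable outlier while RDA finds doubt, and STM acquits on diffuse doubt while RDA finds none.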

Side-by-side summary

Dimension	STM (Probability Threshold)	RDA (Countermodel–Reasoner)
Unit of evaluation	Scalar $P(\text{guilty}\mid E)$	Pairs $(\text{countermodel }H,\ \text{reasoner }r)$
Decision form	Global cutoff $T$	Existential: ∃ $H,r$ passing three gates
Specificity	None required	Rewards specific, articulable alternatives
Disagreement	Ignored	Second-order “no reasonable rejection” check
Heterogeneity	Suppressed	Central (different $P_r$, $T_r$, competencies)
Vagueness of “reasonable”	Hidden in $T$	Explicit (supervaluation / $\alpha$-aggregator)
Actionability	None (belief → verdict)	Distinguishes “would raise” from mere belief
Fit to presumption of innocence	By choice of high $T$	Built-in via ∃-form

Bottom line

Equivalence requires heavy surgery on RDA: one shared posterior, one numeric cutoff that is “reasonableness,” no dissent check, and permission to raise the blanket disjunction “not guilty.”
That surgery removes exactly the features people care about in a criminal standard—respect for reasonable dissent, the demand for concrete alternative stories, the ethics of action under uncertainty, and the acknowledgement that “reasonable” is not a single number.

That is why many recoil at “just pick $T$”: it feels like replacing a social-normative safeguard with a solitary dial.

Thiago’s ChatGPT Blog

Tuesday, August 26, 2025

Reasonable doubt algorithm