#### Supplement to Inductive Logic

## Proof of the Non-Falsifying Refutation Theorem

Here again we explicitly treat the case where only
*condition-independence* is assumed. If
*result-independence* holds as well, all occurrences of
‘(*c*^{k−1}·*e*^{k−1})’ may
be dropped, which gives the results stated in the text. If neither
*independence condition* holds, all occurrences of
‘*c*_{k}·(*c*^{k−1}·*e*^{k−1})’
are replaced by
‘*c*^{n}·*e*^{k−1}’ and
occurrences of
‘*b*·*c*^{k−1}’ are replaced
by ‘*b*·*c*^{n}’.

The proof of Convergence Theorem 2 requires the introduction of one
more concept, that of the *variance in the quality of
information* for a sequence of experiments or observations,
VQI[*c*^{n} | *h*_{i}/*h*_{j} | *b*]. The
quality of the information QI from a specific outcome sequence
*e*^{n} will vary somewhat from the *expected*
*quality of information* for conditions *c*^{n}. A common
statistical measure of how widely individual values tend to vary from
an expected value is given by the *expected squared distance from
the expected value*, which is a quantity called the
*variance*.

Definition: VQI — the Variance in the Quality of Information.

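For *h*_{j} outcome-compatible with *h*_{i} on *c*_{k}, define

$$
\mathrm{VQI}[c_k \mid h_i/h_j \mid b\cdot(c^{k-1}\cdot e^{k-1})] = \sum_{u} \big(\mathrm{QI}[o_{ku} \mid h_i/h_j \mid b\cdot c_k\cdot(c^{k-1}\cdot e^{k-1})] - \mathrm{EQI}[c_k \mid h_i/h_j \mid b\cdot(c^{k-1}\cdot e^{k-1})]\big)^2 \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k\cdot(c^{k-1}\cdot e^{k-1})].
$$

Next define

$$
\mathrm{VQI}[c_k \mid h_i/h_j \mid b\cdot c^{k-1}] = \sum_{\{e^{k-1}\}} \mathrm{VQI}[c_k \mid h_i/h_j \mid b\cdot(c^{k-1}\cdot e^{k-1})] \cdot P[e^{k-1} \mid h_i\cdot b\cdot c^{k-1}].
$$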

For a sequence *c*^{n} of observations on which *h*_{j}
is outcome-compatible with *h*_{i}, define

$$
\mathrm{VQI}[c^n \mid h_i/h_j \mid b] = \sum_{\{e^n\}} \big(\mathrm{QI}[e^n \mid h_i/h_j \mid b\cdot c^n] - \mathrm{EQI}[c^n \mid h_i/h_j \mid b]\big)^2 \cdot P[e^n \mid h_i\cdot b\cdot c^n].
$$

Clearly VQI will be positive unless *h*_{i} and *h*_{j}
agree on the likelihoods of all possible outcome sequences in the
evidence stream, in which case both EQI[*c*^{n} |
*h*_{i}/*h*_{j} | *b*] and
VQI[*c*^{n} | *h*_{i}/*h*_{j} | *b*] equal 0.
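The definitions above can be illustrated for a single observation with finitely many outcomes. The following is a minimal sketch, not part of the original text: the function names `qi`, `eqi`, and `vqi`, the example likelihoods, and the use of natural logarithms are all assumptions made for illustration.

```python
import math

def qi(p_i, p_j):
    """Quality of information of one outcome: log(P[o|h_i] / P[o|h_j])."""
    return math.log(p_i / p_j)

def eqi(probs_i, probs_j):
    """Expected QI under h_i; outcomes with P[o|h_i] = 0 carry no weight."""
    return sum(p_i * qi(p_i, p_j)
               for p_i, p_j in zip(probs_i, probs_j) if p_i > 0)

def vqi(probs_i, probs_j):
    """Expected squared distance of QI from EQI, weighted by P[o|h_i]."""
    mean = eqi(probs_i, probs_j)
    return sum(p_i * (qi(p_i, p_j) - mean) ** 2
               for p_i, p_j in zip(probs_i, probs_j) if p_i > 0)

# Hypothetical two-outcome experiment: likelihoods under h_i and under h_j.
likelihoods_i = [0.7, 0.3]
likelihoods_j = [0.4, 0.6]

print("EQI:", eqi(likelihoods_i, likelihoods_j))
print("VQI:", vqi(likelihoods_i, likelihoods_j))
# When the hypotheses agree on every likelihood, both quantities vanish.
print("VQI (agreement):", vqi(likelihoods_i, likelihoods_i))
```

The sketch assumes outcome-compatibility, so it never divides by a zero likelihood under *h*_{j} for an outcome that *h*_{i} considers possible.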

VQI[*c*^{n} | *h*_{i}/*h*_{j} | *b*] does not
generally decompose into the sum of the VQI for individual experiments
or observations *c*_{k}. However, when both *independence
conditions* hold, the decomposition into the sum does
follow.

Theorem: The VQI Decomposition Theorem for Independent Evidence:

Suppose both *condition-independence* and *result-independence* hold. Then

$$
\mathrm{VQI}[c^n \mid h_i/h_j \mid b] = \sum_{k=1}^{n} \mathrm{VQI}[c_k \mid h_i/h_j \mid b].
$$

For the Proof, we employ the following abbreviations:

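$$
\begin{aligned}
Q[e_k] &= \mathrm{QI}[e_k \mid h_i/h_j \mid b\cdot c_k], & Q[e^k] &= \mathrm{QI}[e^k \mid h_i/h_j \mid b\cdot c^k], \\
E[c_k] &= \mathrm{EQI}[c_k \mid h_i/h_j \mid b], & E[c^k] &= \mathrm{EQI}[c^k \mid h_i/h_j \mid b], \\
V[c_k] &= \mathrm{VQI}[c_k \mid h_i/h_j \mid b], & V[c^k] &= \mathrm{VQI}[c^k \mid h_i/h_j \mid b].
\end{aligned}
$$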

The equation stated by the theorem may be derived as follows:

$$
\begin{aligned}
V[c^n] &= \sum_{\{e^n\}} (Q[e^n] - E[c^n])^2 \cdot P[e^n \mid h_i\cdot b\cdot c^n] \\
&= \sum_{\{e^n\}} \big((Q[e_n] + Q[e^{n-1}]) - (E[c_n] + E[c^{n-1}])\big)^2 \cdot P[e_n \mid h_i\cdot b\cdot c_n] \cdot P[e^{n-1} \mid h_i\cdot b\cdot c^{n-1}] \\
&= \sum_{\{e^{n-1}\}} \sum_{\{e_n\}} \big((Q[e_n] - E[c_n]) + (Q[e^{n-1}] - E[c^{n-1}])\big)^2 \cdot P[e_n \mid h_i\cdot b\cdot c_n] \cdot P[e^{n-1} \mid h_i\cdot b\cdot c^{n-1}] \\
&= \sum_{\{e^{n-1}\}} \sum_{\{e_n\}} \big((Q[e_n] - E[c_n])^2 + (Q[e^{n-1}] - E[c^{n-1}])^2 + 2\cdot(Q[e_n] - E[c_n])\cdot(Q[e^{n-1}] - E[c^{n-1}])\big) \cdot P[e_n \mid h_i\cdot b\cdot c_n] \cdot P[e^{n-1} \mid h_i\cdot b\cdot c^{n-1}] \\
&= V[c_n] + V[c^{n-1}] + 2\cdot\sum_{\{e^{n-1}\}} \sum_{\{e_n\}} \big(Q[e_n]\cdot Q[e^{n-1}] - Q[e_n]\cdot E[c^{n-1}] - E[c_n]\cdot Q[e^{n-1}] + E[c_n]\cdot E[c^{n-1}]\big) \cdot P[e_n \mid h_i\cdot b\cdot c_n] \cdot P[e^{n-1} \mid h_i\cdot b\cdot c^{n-1}] \\
&= V[c_n] + V[c^{n-1}] + 2\cdot\big(E[c_n]\cdot E[c^{n-1}] - E[c_n]\cdot E[c^{n-1}] - E[c_n]\cdot E[c^{n-1}] + E[c_n]\cdot E[c^{n-1}]\big) \\
&= V[c_n] + V[c^{n-1}] \\
&= \dots \\
&= \sum_{k=1}^{n} \mathrm{VQI}[c_k \mid h_i/h_j \mid b].
\end{aligned}
$$
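The decomposition can be checked numerically by enumerating every outcome sequence of a few independent experiments. This is only an illustrative sketch: the helper names `eqi_vqi` and `joint_likelihoods`, the three example experiments, and the use of natural logarithms are assumptions, and independence is built in by multiplying per-experiment likelihoods.

```python
import itertools
import math

def eqi_vqi(probs_i, probs_j):
    """Return (EQI, VQI) for outcome likelihoods under h_i and h_j."""
    pairs = [(p, math.log(p / q)) for p, q in zip(probs_i, probs_j) if p > 0]
    mean = sum(p * x for p, x in pairs)
    var = sum(p * (x - mean) ** 2 for p, x in pairs)
    return mean, var

def joint_likelihoods(experiments):
    """Likelihoods of every outcome sequence, assuming independent experiments.

    `experiments` is a list of (probs_i, probs_j) pairs, one per experiment.
    """
    seq_i, seq_j = [], []
    per_experiment = [list(zip(p_i, p_j)) for p_i, p_j in experiments]
    for sequence in itertools.product(*per_experiment):
        seq_i.append(math.prod(p for p, _ in sequence))
        seq_j.append(math.prod(q for _, q in sequence))
    return seq_i, seq_j

# Hypothetical experiments c_1, c_2, c_3 with different outcome spaces.
experiments = [
    ([0.7, 0.3], [0.4, 0.6]),
    ([0.5, 0.3, 0.2], [0.2, 0.3, 0.5]),
    ([0.9, 0.1], [0.6, 0.4]),
]

vqi_sequence = eqi_vqi(*joint_likelihoods(experiments))[1]
vqi_sum = sum(eqi_vqi(p_i, p_j)[1] for p_i, p_j in experiments)
print(vqi_sequence, vqi_sum)  # the two values agree up to rounding error
```

Replacing the product construction in `joint_likelihoods` with a joint distribution that does not factor would, in general, break the equality, which matches the remark that VQI does not decompose without the independence conditions.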

By averaging the values of VQI[*c*^{n} | *h*_{i}/*h*_{j} | *b*] over the number of observations *n* we
obtain a measure of the *average variance in the quality of the
information* due to *c*^{n}. We represent this average by
underlining ‘VQI’.

Definition: The Average Variance in the Quality of Information

$$
\underline{\mathrm{VQI}}[c^n \mid h_i/h_j \mid b] = \mathrm{VQI}[c^n \mid h_i/h_j \mid b] \div n.
$$

<u>VQI</u> is a true average,
a sum of *n* terms divided by *n*,
only when the *independent evidence* conditions hold. But our
definition here does not presuppose *independence*, and the
notion of “averaging” VQI to obtain
<u>VQI</u>, by dividing by
the number of experiments and observations, turns out to be useful even
when the evidence is not *independent*.

We are now in a position to state a very general version of the second
part of the *Likelihood Ratio Convergence Theorem*. It applies
to all evidence streams not containing *possibly*
*falsifying outcomes* for *h*_{j}. That is, it applies to
all evidence streams for which *h*_{j} is
*outcome-compatible* with *h*_{i} on each *c*_{k}
in the stream. This theorem is essentially a specialized version of
Chebyshev's Theorem, which is a so-called Weak Law of Large
Numbers. This version of the theorem presupposes neither of the
*independence conditions*.

Theorem 2*: Non-falsifying Likelihood Ratio Convergence Theorem

Choose positive ε < 1, as small as you like, but large enough that (for the number of observations *n* being contemplated) the value of <u>EQI</u>[*c*^{n} | *h*_{i}/*h*_{j} | *b*] > −(log ε)/*n*. Then

$$
P\big[\vee\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] < \varepsilon\} \mid h_i\cdot b\cdot c^n\big] \ge 1 - \frac{1}{n}\cdot\frac{\underline{\mathrm{VQI}}[c^n \mid h_i/h_j \mid b]}{\big(\underline{\mathrm{EQI}}[c^n \mid h_i/h_j \mid b] + (\log\varepsilon)/n\big)^2}.
$$

Thus, provided that the average expected quality of the information,
<u>EQI</u>[*c*^{n} | *h*_{i}/*h*_{j} | *b*], for the
stream of experiments and observations *c*^{n} doesn't get
too small (as *n* increases), and provided that the average variance,
<u>VQI</u>[*c*^{n} | *h*_{i}/*h*_{j} | *b*], doesn't
blow up (e.g. it is bounded above), hypothesis *h*_{i} says it is
highly likely that outcomes of *c*^{n} will be such as to make
the likelihood ratio against *h*_{j} as compared to
*h*_{i} as small as you like, as *n* increases.

**Proof:** Let

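$$
\begin{aligned}
V &= \mathrm{VQI}[c^n \mid h_i/h_j \mid b], \\
E &= \mathrm{EQI}[c^n \mid h_i/h_j \mid b], \\
Q[e^n] &= \mathrm{QI}[e^n \mid h_i/h_j \mid b\cdot c^n] = \log\big(P[e^n \mid h_i\cdot b\cdot c^n]/P[e^n \mid h_j\cdot b\cdot c^n]\big),
\end{aligned}
$$

and write <u>V</u> = V/*n* and <u>E</u> = E/*n* for the corresponding averages.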

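Choose any small ε > 0, and suppose (for *n* large
enough) that <u>E</u> > −(log ε)/*n*, which is the same as E > −(log ε). Then we have

$$
\begin{aligned}
V &= \sum_{\{e^n : P[e^n \mid h_j\cdot b\cdot c^n] > 0\}} (E - Q[e^n])^2 \cdot P[e^n \mid h_i\cdot b\cdot c^n] \\
&\ge \sum_{\{e^n : P[e^n \mid h_j\cdot b\cdot c^n] > 0 \;\&\; Q[e^n] \le -(\log\varepsilon)\}} (E - Q[e^n])^2 \cdot P[e^n \mid h_i\cdot b\cdot c^n] \\
&\ge (E + (\log\varepsilon))^2 \cdot \sum_{\{e^n : P[e^n \mid h_j\cdot b\cdot c^n] > 0 \;\&\; Q[e^n] \le -(\log\varepsilon)\}} P[e^n \mid h_i\cdot b\cdot c^n] \\
&= (E + (\log\varepsilon))^2 \cdot P\big[\vee\{e^n : P[e^n \mid h_j\cdot b\cdot c^n] > 0 \;\&\; Q[e^n] \le \log(1/\varepsilon)\} \mid h_i\cdot b\cdot c^n\big] \\
&= (E + (\log\varepsilon))^2 \cdot P\big[\vee\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] \ge \varepsilon\} \mid h_i\cdot b\cdot c^n\big].
\end{aligned}
$$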

So,

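$$
\begin{aligned}
\frac{\underline{V}}{n\cdot(\underline{E} + (\log\varepsilon)/n)^2} = \frac{V}{(E + (\log\varepsilon))^2}
&\ge P\big[\vee\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] \ge \varepsilon\} \mid h_i\cdot b\cdot c^n\big] \\
&= 1 - P\big[\vee\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] < \varepsilon\} \mid h_i\cdot b\cdot c^n\big].
\end{aligned}
$$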

Thus, for any small ε > 0,

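$$
P\big[\vee\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] < \varepsilon\} \mid h_i\cdot b\cdot c^n\big] \ge 1 - \frac{\underline{V}}{n\cdot(\underline{E} + (\log\varepsilon)/n)^2}.
$$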

(End of Proof)
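To get a feel for how conservative the Chebyshev-style bound is, it can be compared with an exactly computed probability in a simple case. The sketch below assumes *n* independent, identically distributed binary trials, so that the averages <u>EQI</u> and <u>VQI</u> coincide with the per-trial EQI and VQI. The function names, the chosen likelihoods, and the use of natural logarithms are assumptions for illustration only.

```python
import math

def chebyshev_bound(avg_eqi, avg_vqi, n, eps):
    """Lower bound of Theorem 2*: 1 - avg_vqi / (n * (avg_eqi + log(eps)/n)**2)."""
    margin = avg_eqi + math.log(eps) / n
    if margin <= 0:
        raise ValueError("the theorem requires avg_eqi > -(log eps)/n")
    return 1 - avg_vqi / (n * margin ** 2)

def exact_probability(p_i, p_j, n, eps):
    """P[likelihood ratio of h_j to h_i is below eps | h_i] for n binary trials."""
    total = 0.0
    for k in range(n + 1):  # k = number of "successes" in the outcome sequence
        log_ratio = (k * math.log(p_j / p_i)
                     + (n - k) * math.log((1 - p_j) / (1 - p_i)))
        if log_ratio < math.log(eps):
            total += math.comb(n, k) * p_i ** k * (1 - p_i) ** (n - k)
    return total

# Hypothetical setup: success chance 0.7 under h_i and 0.4 under h_j.
p_i, p_j, n, eps = 0.7, 0.4, 100, 0.01

# Per-trial EQI and VQI; for i.i.d. trials these equal the averages over n.
qi_values = [math.log(p_i / p_j), math.log((1 - p_i) / (1 - p_j))]
weights = [p_i, 1 - p_i]
avg_eqi = sum(w * x for w, x in zip(weights, qi_values))
avg_vqi = sum(w * (x - avg_eqi) ** 2 for w, x in zip(weights, qi_values))

bound = chebyshev_bound(avg_eqi, avg_vqi, n, eps)
exact = exact_probability(p_i, p_j, n, eps)
print(f"Chebyshev lower bound: {bound:.4f}, exact probability: {exact:.4f}")
```

Because the theorem only supplies a lower bound, the exact probability should never fall below the printed bound. In cases like this one it will typically be noticeably higher.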

The previous theorem shows that when <u>VQI</u> is bounded above, a
sufficiently long stream of evidence will very likely result in the
refutation of false competitors of a true hypothesis. This claim holds
regardless of whether the evidence can be chunked into
*independent* pieces. However, we can use the *independence
conditions* to describe a very simple provision under which
<u>VQI</u> is indeed bounded above. This gives us the theorem stated
in the main text.

Likelihood Ratio Convergence Theorem 2 — The Non-falsifying Refutation Theorem.

Suppose that the *independent evidence conditions* hold. And suppose γ > 0 is a number smaller than 1/*e*^{2} (≈ .135). And suppose that for each possible outcome *o*_{ku} of each observation condition *c*_{k} in *c*^{n}, either *P*[*o*_{ku} | *h*_{i}·*b*·*c*_{k}·(*c*^{k−1}·*e*^{k−1})] = 0 or

$$
P[o_{ku} \mid h_j\cdot b\cdot c_k\cdot(c^{k-1}\cdot e^{k-1})] \,/\, P[o_{ku} \mid h_i\cdot b\cdot c_k\cdot(c^{k-1}\cdot e^{k-1})] \ge \gamma.
$$

Choose positive ε < 1, as small as you like, but large enough (for the number of observations *n* being contemplated) that the value of <u>EQI</u>[*c*^{n} | *h*_{i}/*h*_{j} | *b*] > −(log ε)/*n*. Then

$$
P\big[\vee\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] < \varepsilon\} \mid h_i\cdot b\cdot c^n\big] > 1 - \frac{1}{n}\cdot\frac{(\log\gamma)^2}{\big(\underline{\mathrm{EQI}}[c^n \mid h_i/h_j \mid b] + (\log\varepsilon)/n\big)^2}.
$$

**Proof**: This follows from Theorem 2* together with the
following observation, which holds given the *independence
conditions*:

If for each *c*_{k} in *c*^{n}, for each of its possible outcomes *o*_{ku}, either *P*[*o*_{ku} | *h*_{j}·*b*·*c*_{k}] = 0 or *P*[*o*_{ku} | *h*_{j}·*b*·*c*_{k}]/*P*[*o*_{ku} | *h*_{i}·*b*·*c*_{k}] ≥ γ > 0, where γ < 1, then <u>V</u> = <u>VQI</u>[*c*^{n} | *h*_{i}/*h*_{j} | *b*] ≤ (log γ)^{2}.

To see that this observation holds, assume its antecedent.

- First notice that when 0 < *P*[*e*_{k} | *h*_{j}·*b*·*c*_{k}] < *P*[*e*_{k} | *h*_{i}·*b*·*c*_{k}] we have

  $$
  \big(\log[P[e_k \mid h_i\cdot b\cdot c_k]/P[e_k \mid h_j\cdot b\cdot c_k]]\big)^2 \cdot P[e_k \mid h_i\cdot b\cdot c_k] \le (\log\gamma)^2 \cdot P[e_k \mid h_i\cdot b\cdot c_k].
  $$

  So we only need establish that when *P*[*e*_{k} | *h*_{j}·*b*·*c*_{k}] > *P*[*e*_{k} | *h*_{i}·*b*·*c*_{k}] > 0, we will also have this relationship, i.e., we will also have

  $$
  \big(\log[P[e_k \mid h_i\cdot b\cdot c_k]/P[e_k \mid h_j\cdot b\cdot c_k]]\big)^2 \cdot P[e_k \mid h_i\cdot b\cdot c_k] \le (\log\gamma)^2 \cdot P[e_k \mid h_i\cdot b\cdot c_k].
  $$

  (Then it will follow easily that <u>VQI</u>[*c*^{n} | *h*_{i}/*h*_{j} | *b*] ≤ (log γ)^{2}, and we'll be done.)

- To establish the needed relationship, suppose that *P*[*e*_{k} | *h*_{j}·*b*·*c*_{k}] > *P*[*e*_{k} | *h*_{i}·*b*·*c*_{k}] > 0. Notice that for all *p* ≤ *q*, *p* and *q* between 0 and 1, the function *g*(*p*) = (log(*p*/*q*))^{2}·*p* has a minimum at *p* = *q*, where *g*(*p*) = 0, and (for *p* < *q*) has a maximum value at *p* = *q*/*e*^{2}, i.e., at *p*/*q* = 1/*e*^{2}. (To get this, take the derivative of *g*(*p*) with respect to *p* and set it equal to 0; this gives a maximum for *g*(*p*) at *p* = *q*/*e*^{2}.)

  So, for 0 < *P*[*e*_{k} | *h*_{i}·*b*·*c*_{k}] < *P*[*e*_{k} | *h*_{j}·*b*·*c*_{k}] we have

  $$
  \big(\log(P[e_k \mid h_i\cdot b\cdot c_k]/P[e_k \mid h_j\cdot b\cdot c_k])\big)^2 \cdot P[e_k \mid h_i\cdot b\cdot c_k] \le \big(\log(1/e^2)\big)^2 \cdot P[e_k \mid h_j\cdot b\cdot c_k] \le (\log\gamma)^2 \cdot P[e_k \mid h_j\cdot b\cdot c_k]
  $$

  (since, for γ ≤ 1/*e*^{2} we have log γ ≤ log(1/*e*^{2}) < 0; so (log γ)^{2} ≥ (log(1/*e*^{2}))^{2} > 0).

- Now (assuming the antecedent of the theorem), for each *c*_{k}, writing QI[*c*_{k}] for QI[*o*_{ku} | *h*_{i}/*h*_{j} | *b*·*c*_{k}] and EQI[*c*_{k}] for EQI[*c*_{k} | *h*_{i}/*h*_{j} | *b*],

  $$
  \begin{aligned}
  \mathrm{VQI}[c_k \mid h_i/h_j \mid b] &= \sum_{\{o_{ku} : P[o_{ku} \mid h_j\cdot b\cdot c_k] > 0\}} (\mathrm{EQI}[c_k] - \mathrm{QI}[c_k])^2 \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k] \\
  &= \sum_{\{o_{ku} : P[o_{ku} \mid h_j\cdot b\cdot c_k] > 0\}} \big(\mathrm{EQI}[c_k]^2 - 2\cdot\mathrm{QI}[c_k]\cdot\mathrm{EQI}[c_k] + \mathrm{QI}[c_k]^2\big) \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k] \\
  &= \sum_{\{o_{ku} : P[o_{ku} \mid h_j\cdot b\cdot c_k] > 0\}} \mathrm{QI}[c_k]^2 \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k] \;-\; \mathrm{EQI}[c_k]^2 \\
  &\le \sum_{\{o_{ku} : P[o_{ku} \mid h_j\cdot b\cdot c_k] > 0\}} \mathrm{QI}[c_k]^2 \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k] \\
  &\le (\log\gamma)^2.
  \end{aligned}
  $$

So, given *independence*,

$$
\underline{\mathrm{VQI}}[c^n \mid h_i/h_j \mid b] = (1/n)\cdot\sum_{k=1}^{n} \mathrm{VQI}[c_k \mid h_i/h_j \mid b] \le (\log\gamma)^2.
$$
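Two ingredients of this argument lend themselves to a quick numerical sanity check: the location of the maximum of *g*(*p*) = (log(*p*/*q*))^{2}·*p* on 0 < *p* < *q*, and the per-experiment bound VQI ≤ (log γ)^{2}. The sketch below is illustrative only. The value of `q`, the grid resolution, the example likelihoods, the choice γ = 0.1, and the use of natural logarithms are all assumptions.

```python
import math

def g(p, q):
    """g(p) = (log(p/q))**2 * p, the function bounded in the proof."""
    return math.log(p / q) ** 2 * p

q = 0.8                                          # hypothetical P[e_k | h_j·b·c_k]
steps = 100_000
grid = [q * i / steps for i in range(1, steps)]  # points strictly inside (0, q)
best_p = max(grid, key=lambda p: g(p, q))

print("grid maximiser:", best_p, " predicted q/e^2:", q / math.e ** 2)
print("grid maximum:", g(best_p, q), " predicted 4q/e^2:", 4 * q / math.e ** 2)

# Per-experiment bound VQI <= (log gamma)^2 on a hypothetical two-outcome case.
gamma = 0.1                                  # any gamma below 1/e^2 (about 0.135)
probs_i, probs_j = [0.7, 0.3], [0.4, 0.6]    # every ratio P_j/P_i is at least gamma
qi = [math.log(a / b) for a, b in zip(probs_i, probs_j)]
eqi = sum(a * x for a, x in zip(probs_i, qi))
vqi = sum(a * (x - eqi) ** 2 for a, x in zip(probs_i, qi))
print("VQI:", vqi, " bound (log gamma)^2:", math.log(gamma) ** 2)
```

A grid search is used instead of a symbolic derivative so that the check does not depend on the calculus step it is meant to confirm.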