WHEN MUST ONE USE A NON-ANALYTIC INDUCTION HYPOTHESIS?——SUMMARY OF SOME RESULTS
────────────────────────────────────────────────────────────────────────────────

version: 2019-03-15

Eric Johannesson and Anders Lundstedt

eric.johannesson@philosophy.su.se
anders.lundstedt@philosophy.su.se


§1 Introduction
───────────────

In §2 we give two examples of facts that when proved seem to require "a
strengthening of the induction hypothesis". The two facts are:

 (F1) For all natural numbers n: 1+3+5+⋅⋅⋅+(2n-1) = k² for some natural number
      k.

 (F2) For all natural numbers n: 1+1/2²+1/3²+⋅⋅⋅+1/(n+1)² < 2.

                                       ∗∗

Hetzl and Wong (2017) have made precise sense of the notion of "proof by using a
non-analytic induction hypothesis". (What one might call "proof by a
strengthening of the induction hypothesis" is a special case of "proof by using
a non-analytic induction hypothesis".) In §3 we present a slight generalization
of their formalization.

                                       ∗∗

In §4 we show that, using the definition given in §3, there is a precise sense
in which (F1) must be proved by "using a non-analytic induction hypothesis".

                                       ∗∗

In §5 we present some preliminary results towards proving (or disproving) that
(F2), in the sense given by the definition in §3, must be proved by "using a
non-analytic induction hypothesis".

                                       ∗∗

In §6 we present some ideas for future work.

                                       ∗∗

§2 Two results proved by "a strengthening of the induction hypothesis"
──────────────────────────────────────────────────────────────────────

For our first case, we have that the sum of any initial segment of the odd
numbers is a perfect square, that is,

      for all natural numbers n: 1+3+5+⋅⋅⋅+(2n-1) = k² for some natural number
      k,

or, avoiding ellipsis notation, the following.

                                       ∗∗

DEFINITION. f₁ : ℕ → ℕ is recursively defined by

        f₁(0) := 0,

      f₁(n+1) := f₁(n)+2n+1.

                                       ∗∗

FACT 1. For all natural numbers n: f₁(n) = k² for some natural number k.

PROOF ATTEMPT. "Straightforward induction":

    – Base case: f₁(0) = 0 = 0² (by definitions) so f₁(0) = k² for k = 0.

    – Induction step:

            f₁(n+1) = f₁(n)+2n+1  (by definition)
                    = k²+2n+1     (for some k, by induction hypothesis).

      But k²+2n+1 is not a perfect square for all natural numbers k and n so how
      do we proceed from here?

ACTUAL PROOF. It suffices to prove the following "stronger" fact, of which
Fact 1 is a logical consequence.

      f₁(n) = n² for all natural numbers n.

(This fact is not a logical consequence of Fact 1, so it is stronger than Fact 1
in at least that sense.)

We prove this fact by induction.

    – Base case: f₁(0) = 0² ≡ 0 = 0 (by definitions).

    – Induction step:

            f₁(n+1) = f₁(n)+2n+1  (by definition)
                    = n²+2n+1     (by induction hypothesis)
                    = (n+1)²      (by elementary arithmetic).                  □

                                       ∗∗

For our second case, we have

      1+1/2²+1/3²+⋅⋅⋅+1/(n+1)² < 2 for all natural numbers n,

or, avoiding ellipsis notation, the following.

                                       ∗∗

DEFINITION. f₂ : ℕ → ℚ is recursively defined by

        f₂(0) := 1,

      f₂(n+1) := f₂(n)+1/(n+2)².

                                       ∗∗

FACT 2. f₂(n) < 2 for all natural numbers n.

PROOF ATTEMPT. "Straightforward" induction:

    – Base case: f₂(0) < 2 ≡ 1 < 2 (by definition).

    – Induction step:

            f₂(n+1) = f₂(n)+1/(n+2)²  (by definition)
                    < 2+1/(n+2)²      (by induction hypothesis).

      But 2+1/(n+2)² ≮ 2 for any natural number n so how do we proceed from
      here?

ACTUAL PROOF. It clearly suffices to prove the "stronger" fact

      f₂(n) ≤ 2-1/(n+1) for all natural numbers n.

(In what sense is this fact stronger than Fact 2? It is at least stronger in the
sense that for arbitrary f : ℕ → ℚ, f(n) ≤ 2-1/(n+1) for all n implies f(n) < 2
for all n while the converse implication need not hold.)

We prove this fact by induction.

    – Base case: f₂(0) ≤ 2-1/(0+1) ≡ 1 ≤ 1 (by definitions).

    – Induction step:

            f₂(n+1) = f₂(n)+1/(n+2)²          (by definition)
                    ≤ 2-1/(n+1)+1/(n+2)²      (by induction hypothesis)
                    = 2-1/(n+2)-1/(n+1)(n+2)² (by lots of elementary arithmetic)
                    ≤ 2-1/(n+2)               (by more or less obvious facts). □

                                       ∗∗

§3 Definitions
──────────────

What follows in this section is a reformulation and slight generalization of
some of the notions introduced by Hetzl and Wong (2017).

                                       ∗∗

DEFINITION. The ∗full (first-order) language of arithmetic∗, notation ℒ[full],
is the first-order language that for each natural number n has

    – a constant symbol n,

    – a function symbol f of arity n+1 for each function f : ℕⁿ⁺¹ → ℕ,

    – a relation symbol P of arity n for each relation P ⊆ ℕⁿ.

                                       ∗∗

DEFINITION. The ∗minimal (first-order) language of arithmetic∗, notation ℒ[min],
is the ℒ[full]-reduct with signature ⟨0,1,+,⋅,<⟩.

                                       ∗∗

DEFINITION. A first-order language L is a ∗(first-order) language of arithmetic∗
if and only if L is an ℒ[min]-expansion and an ℒ[full]-reduct.

                                       ∗∗

Thus a first-order language of arithmetic has as symbols natural numbers and
operations on natural numbers and relations on natural numbers. This is just a
convenient "trick" which allows us to easily define the standard model.

                                       ∗∗

DEFINITION. Let L be a language of arithmetic.

    – The ∗standard L-model∗ has domain ℕ and each symbol interpreted as itself.

    – The L-theory ∗true L-arithmetic∗ is the theory of the standard L-model.

    – Any subset of true L-arithmetic is a ∗theory of L-arithmetic∗.

                                       ∗∗

DEFINITION. Let L be a language of arithmetic and let φ(x) be an L-formula with
at most one free variable x. The ∗induction instance∗ for φ(x) is the L-sentence

      IND(φ) :≡ φ(0)∧∀x(φ(x)→φ(x+1))→∀x.φ(x).

                                       ∗∗

DEFINITION. Let L be a language of arithmetic and let T be an L-theory. Let φ(x)
and ψ(x) be L-formulas both with at most one free variable x. Say that ∗ψ
witnesses that T proves ∀x.φ(x) with and only with a non-analytic induction
hypothesis∗ if and only if

  (1) T,IND(φ) ⊬ ∀x.φ(x),

  (2) T ⊢ φ(0),

  (3) T ⊢ ψ(0),

  (4) T ⊢ ∀x,ψ(x)→ψ(x+1),

  (5) T ⊢ ∀x.ψ(x)→∀x.φ(x).

                                       ∗∗

DEFINITION. The ℒ[min]-theory PA⁻ is axiomatized by the following.

    – 0 and 1 are distinct:

            0 ≠ 1.

    – Associativity of addition and multiplication:

            (x+y)+z = x+(y+z),

            (x⋅y)⋅z = x⋅(y⋅z).

    – Commutativity of addition and multiplication:

            x+y = y+x,

            x⋅y = y⋅x.

    – Distributivity of addition over multiplication:

            x⋅(y+z) = x⋅y+x⋅z.

    – 0 is an additive identity and a multiplicative zero:

            x⋅0 = 0.

    – The order is irreflexive:

            x ≮ x.

    – The order is transitive:

            x < y ∧ y < z → x < z.

    – The order is total:

            x < y ∨ x = y ∨ y < x.

    – The order is discrete:

            x = 0 ∨ x = 1 ∨ 1 < x.

    – Addition and multiplication respect the order:

            x < y → x+z < y+z,

            x < y ∧ 0 ≠ z → x⋅z < y⋅z.

    – Smaller elements can be subtracted from larger:

            x < y → ∃z. x+z = y.

                                       ∗∗

FACT. For all ℒ[min]-structures M: M ⊨ PA⁻ if and only if M is the non-negative
part of a nontrivial discretely ordered commutative ring.

                                       ∗∗

FACT. PA⁻ is a theory of arithmetic.

                                       ∗∗

FACT. PA = PA⁻∪{IND(φ) : φ an ℒ[min]-formula with at most one free variable}

                                       ∗∗

§4 Fact 1 must be proved using a non-analytic induction hypothesis
──────────────────────────────────────────────────────────────────

DEFINITION. ℤ[X] := ⟨ℤ[X],0,1,+,⋅,<⟩ is the ordered ring of polynomials with
coefficients in ℤ.

                                       ∗∗

Elements of ℤ[X] are polynomials

      z₀+z₁X+z₂X²+⋅⋅⋅+zₙXⁿ

with z₀,z₁,...,zₙ in ℤ and if n ≠ 0 then zₙ ≠ 0. These can be divided into the
∗constants∗ polynomials

      z  (z in ℤ)

and the ∗non-constant∗ polynomials

      z+pX  (p in ℤ[X], z in ℤ).

                                       ∗∗

Addition and multiplication in ℤ[X] are as expected. The order is given by

      z₀+z₁X+z₂X²+⋅⋅⋅+zₙXⁿ > 0  if and only if   zₙ > 0,

                         p > q  if and only if  p-q > 0.

                                       ∗∗

DEFINITION. ℤ[X]⁺ = ⟨ℤ[X]⁺,0,1,+,⋅,<⟩ is the non-negative part of ℤ[X], that is,
a polynomial p from ℤ[X] is in ℤ[X]⁺ if and only if p ≥ 0.

                                       ∗∗

Elements of ℤ[X]⁺ are polynomials

      z₀+z₁X+z₂X²+⋅⋅⋅+zₙXⁿ

with z₀,z₁,...,zₙ in ℤ and zₙ ≥ 0 and if n ≠ 0 then zₙ ≠ 0. The constant
polynomials are

      n  (n in ℕ)

and the non-constant polynomials are

      z+pX  (p in ℤ[X], z in ℤ, p > 0).

                                       ∗∗

FACT. ℤ[X]⁺ ⊨ PA⁻.

PROOF. ℤ[X]⁺ is the is the non-negative part of the nontrivial discretely
ordered commutative ring ℤ[X].                                                 □

                                       ∗∗

DEFINITION. The language of arithmetic L₁ is ℒ[min] expanded with the function
symbol f₁.

                                       ∗∗

DEFINITION. The theory of arithmetic T₁ is defined by

      T₁ := PA⁻, f₁(0) = 0, ∀x. f₁(x+1) = f₁(x)+2x+1.

                                       ∗∗

DEFINITION. The L-formulas φ₁(x) and ψ₁(x) are defined by

      φ₁(x) :≡ ∃y. f₁(x) = y⋅y,

      ψ₁(x) :≡ f₁(x) = x⋅x.

                                       ∗∗

FACT. ψ₁ witnesses that T₁ proves ∀x.φ₁(x) with and only with a non-analytic
induction hypothesis.

PROOF. We need to verify conditions (1)–(5).

  (2) Trivial.

  (3) Trivial.

  (4) The proof of Fact 1 given earlier can be straightforwardly formalized
      in PA⁻.

  (5) Trivial.

  (1) We exhibit an L₁-model satisfying T₁ and IND(φ₁) but not ∀x.φ₁(x). Let M
      be the L₁-expansion of ℤ[X]⁺ with f₁ interpreted as follows.

                     f₁ᴹ(0) := 0,

                   f₁ᴹ(n+1) := f₁ᴹ(n)+2n+1,

                  f₁ᴹ(pX-1) := pX²

            f₁ᴹ(pX-1+(n+1)) := f₁ᴹ(pX-1+n)+2(pX-1+n)+1,

            f₁ᴹ(pX-1-(n+1)) := f₁ᴹ(pX-1-n)-2(pX-1-n)+1.

      The right hand side of the last equation does indeed define a polynomial
      of ℤ[X]⁺ since by construction the degree of f₁ᴹ(z+pX) is greater than the
      degree of pX for all integers z and all p in ℤ[X]⁺. (We get the last
      equation by solving

            f₁(pX-1-n) = f₁(pX-1-(n+1)+1) = f₁(pX-1-(n+1))+2(pX-1-(n+1))+1

      for f₁(pX-1-(n+1)).)

      We then have

            f₁ᴹ(X-1) = X²

      which is a perfect square in M and we have

            f₁ᴹ(X) = f₁ᴹ(X-1)+2(X-1)+1
                   = X²+2X-1

      which is not a perfect square in M. Thus:

          – M ⊭ ∀x.φ₁(x) since M ⊭ φ₁(X).

          – M ⊨ IND(φ₁) since M ⊭ ∀x,φ₁(x)→φ₁(x+1) since M ⊨ φ₁(X) but
            M ⊭ φ₁(X+1).

      By construction we also have M ⊨ T₁ so we are done.                      □

                                       ∗∗

§5 Must Fact 2 be proved using a non-analytic induction hypothesis?
───────────────────────────────────────────────────────────────────

Since Fact 2 is a statement involving rationals, a little work is needed to
phrase it as a natural statement in first-order arithmetic (that is, without
involving any coding).

                                       ∗∗

DEFINITION. g : ℕ → ℕ and h : ℕ → ℕ are recursively defined by

        g(0) := 1,

      g(n+1) := (n+2)²g(n)+h(n),

        h(0) := 1,

      h(n+1) := (n+2)²h(n).

                                       ∗∗

We have

      f₂(n) = g(n)/h(n)

so the inequality f₂(n) < 2 can be rewritten as

      g(n) < 2h(n).

Similarly the inequality f₂(n) ≤ 2-1/(n+1) becomes

      (n+1)g(n) ≤ (2n+1)h(n).

                                       ∗∗

All in all, we have the following rephrasings of Fact 2 and its proof attempt
and proof.

                                       ∗∗

LEMMA. For all natural numbers n, if (n+1)g(n) ≤ (2n+1)h(n) then g(n) < 2h(n).

PROOF. Suppose (n+1)g(n) ≤ (2n+1)h(n). Since (2n+1)h(n) < 2(n+1)h(n) we then
have (n+1)g(n) < 2(n+1)h(n) so we must have g(n) < 2h(n).                      □

                                       ∗∗

FACT 2'. g(n) < 2h(n) for all n.

PROOF ATTEMPT. Induction on n.

    – Base case:

            g(0) < 2h(0) ≡ 1 < 2⋅1  (by definition)
                         ≡ 1 < 2.

    – Induction step:

            g(n+1) = (n+2)²g(n)+h(n)    (by definition)
                   < (n+2)²⋅2h(n)+h(n)  (by induction hypothesis)

      But for any n we have (n+2)²⋅2h(n)+h(n) ≮ 2h(n). Thus we are "stuck".

ACTUAL PROOF. By the lemma, it suffices to prove

      (n+1)g(n) ≤ (2n+1)h(n) for all n.

We prove this by induction on n.

    – Base case:

            (0+1)g(0) ≤ (2⋅0+1)h(0)  (by definition)
                      ≡ 1⋅1 ≤ 1⋅1
                      ≡ 1 ≤ 1.

    – Induction step: Our induction hypothesis is

       (IH) (n+1)g(n) ≤ (2n+1)h(n)

      and we must show

      (GOAL) (n+2)g(n+1) ≤ (2n+3)h(n+1).

      We have

            (n+2)g(n+1) = (n+2)((n+2)²g(n)+h(n))
                        = (n+2)³g(n)+(n+2)h(n)

      and

            (2n+3)h(n+1) = (2n+3)(n+2)²h(n)
                         = (2n+3)(n+2)(n+2)h(n)
                         = (2n²+7n+6)(n+2)h(n)
                         = (2n²+7n+5)(n+2)h(n)+(n+2)h(n)

      so (GOAL) is equivalent to

            (n+2)³g(n) ≤ (2n²+7n+5)(n+2)h(n),

      which is equivalent to

            (n+2)²g(n) ≤ (2n²+7n+5)h(n),

      which we have by

            (n+2)²g(n) = ((n+1)+1)²g(n)
                       = ((n+1)²+2(n+1)+1)g(n)
                       = ((n+1)²+2(n+1))g(n)+g(n)
                       = (n+3)(n+1)g(n)+g(n)
                       ≤ (n+3)(2n+1)h(n)+g(n)      (by (IH))
                       ≤ (n+3)(2n+1)h(n)+2h(n)     (by Lemma and (IH))
                       = (2n²+7n+5)h(n).                                       □

                                       ∗∗

DEFINITION. The language of arithmetic L₂ is ℒ[min] expanded with the function
symbols g and h.

                                       ∗∗

DEFINITION. The L₂-sentences DEF(g) and DEF(h) and the L₂-formulas φ₂(x) and
ψ₂(x) are defined by

      DEF(g) :≡ g(0) = 1 ∧ ∀x. g(x+1) = (x+2)²⋅g(x)+h(x)

      DEF(h) :≡ h(0) = 1 ∧ ∀x. h(x+1) = (x+2)²⋅h(x),

       φ₂(x) :≡ g(x) < 2⋅h(x),

       ψ₂(x) :≡ (x+1)⋅g(x)+h(x) ≤ 2⋅(x+1)⋅h(x).

                                       ∗∗

CONJECTURE. There is an L₂-theory of arithmetic T ⊇ PA⁻ such that

  (1) T ⊢ DEF(g),

  (2) T ⊢ DEF(h),

  (3) ψ₂ witnesses that T proves ∀x.φ₂(x) with and only with a non-analytic
      induction hypothesis.

                                       ∗∗

Adapting the proof of the previous section——cleverly interpreting the new
function symbols on ℤ[X]⁺——will not work to settle the above conjecture, as the
following results shows.

                                       ∗∗

LEMMA. Let M be an L₂-expansion of ℤ[X]⁺ such that M ⊨ DEF(h). Let p be a
non-zero polynomial and let z be an integer. If hᴹ(pX+z) ≠ 0 then hᴹ(pX+z-1) ≠ 0
and degree(hᴹ(pX+z)) > degree(hᴹ(pX+z-1)).

PROOF. We have

      hᴹ(pX+z) = (pX+z+1)²hᴹ(pX+z-1)

since M ⊨ DEF(h) so clearly if hᴹ(pX+z) ≠ 0 then hᴹ(pX+z-1) ≠ 0 and
degree(hᴹ(pX+z)) > degree(hᴹ(pX+z-1)).                                         □

                                       ∗∗

FACT. Let M be an L₂-expansion of ℤ[X]⁺. If M ⊨ DEF(h) then h(p) = 0 for all
non-constant polynomials p.

PROOF. Suppose h(pX+z) ≠ 0 for some non-constant polynomial pX+z. By the above
lemma we then have the contradiction that we have an infinite descending
chain

      degree(hᴹ(pX+z)) > degree(hᴹ(pX+z-1))
                       > degree(hᴹ(pX+z-2))
                       > degree(hᴹ(pX+z-3))
                       ⋮                                                       □

                                       ∗∗

Note that the above results should generalize to any polynomial ring R[X].
Perhaps one could settle the conjecture by considering some ring R[X,X⁻¹] of
Laurent polynomials.

                                       ∗∗

§5 Future work
──────────────

One line of future work would of course be to settle the conjecture in §4.

                                       ∗∗

In §3 we expanded ℤ[X]⁺ to a L₁-model in order to show that T₁ proves ∀x.φ₁(x)
with and only with a non-analytic induction hypothesis. For each sentence σ of
true L₁-arithmetic that is false in ℤ[X]⁺ it is natural to ask whether adding σ
to T₁ lets us prove ∀x.φ₁(x) without needing a non-analytic induction
hypothesis. One simple sentence of true L₁-arithmetic that is not true in ℤ[X]⁺
is "all numbers are odd or even", that is

      σ₁ :≡ ∀x∃y, x = y+y ∨ x = y+y+1.

                                       ∗∗

CONJECTURE. ψ₁ witnesses that T₁∪{σ₁} proves ∀x.φ₁(x) with and only with a
non-analytic induction hypothesis.

                                       ∗∗

To more systematically settling conjectures like the one above one could attempt
to establish more general results——instead of hand-crafting countermodels for
each particular case. We hope that the literature on first-order arithmetic
already contains lots of applicable results.

                                       ∗∗

Veryfing provability in weak fragments of arithmetic is hard. It is easy to
rely on something true that is not provable in the fragment one works with. Thus
it would be worthwhile to verify the provability statements in §3 with a theorem
prover.

                                       ∗∗

An interesting future line of work would be to consider other settings than
arithmetic. For example, in computer science, basic facts of operations on
inductive structures often seem to require a non-analytic induction hypothesis.

                                       ∗∗

One might also approach the problem of non-analytic induction proofs from the
more proof-theoretical side, for example by studying derivations in natural
deduction. Dag Prawitz's (2018) recent work may be useful.

                                       ∗∗

REFERENCES
──────────

Hetzl, Stefan and Tin Lok Wong (2017): "Some observations on the logical
foundations of inductive theorem proving", Logical Methods in Computer Science
13(4).

Prawitz, Dag (2018): "The concepts of proof and ground", preprint.