Term substitution and substitution lemma
From Learning Logic for Computer Science
Free occurrences of variables in first-order logic can be replaced meaningful by terms.
Contents
Introduction
The statement '$6$ is not a square' is true in the natural numbers. Since $6$ equals the value of $2+4$, the value of $2+4$ is not a square. Now suppose we had a formal statement with a free variable $x_0$ which expresses that $x_0$ is not a square. Then it is true under the assignment which maps $x_0$ to $6$. One would expect that it is also true when $x_0$ is replaced by $x_2 + x_3$ and under an assignment which maps $x_2$ to $2$ and $x_3$ to $4$.
This is exactly what term substitution is supposed to do: to mimic the effect of evaluating a term in the syntax.
Example Let $\mathcal N$ be the $\{\dot{+}/\!/2, \dot{\cdot}/\!/2\}$-structure with $N$ being the set of natural numbers and $\dot{+}^{\mathcal N}$ and $\dot{\cdot}^{\mathcal N}$ being natural addition and multiplication, respectively. Let $\varphi$ be defined by \begin{align} \varphi & = \forall x_1 \neg x_0 \doteq \dot{\cdot}(x_1, x_1) \enspace. \end{align} Then, clearly, \begin{align} \mathcal N, \{x_0 \mapsto 6\} & \mymodels \varphi \enspace. \end{align} We argued above that we expect the following to follow from this: \begin{align} \mathcal N, \{x_2 \mapsto 2, x_3 \mapsto 4\} & \mymodels \forall x_1 \neg \dot{+}(x_2, x_3) \doteq \dot{\cdot}(x_1, x_1) \enspace. \end{align} In fact, this is true.
So term substitution is supposed to have the following quality: when one thinks of a value for a variable as being the result of evaluating a term, then one wants the result of the substitution of the variable by the term to express the same as before but now for the term.
The basic idea for implementing term substitution is to replace the free occurrences of the variable of interest by the term in question. However, problems may arise due to name clashes.
Example The number $6$ can be written as $3 + 3$, so it is the value of $\dot{+}(x_1, x_1)$ under the assignment $\{x_1 \mapsto 3\}$. But because of a name clash the substitution of $x_0$ does not express what we want: \begin{align} \mathcal N, \{x_1 \mapsto 3\} & \not\mymodels \forall x_1 \neg \dot{+}(x_1, x_1) \doteq \dot{\cdot}(x_1, x_1) \enspace. \end{align}
The formal definition of substitution does not avoid name clashes. This is why the substituion lemma below has a side condition on name clashes.
Formal definition
Definition [term substitution] Let $\mathcal S$ be a signature. An $\mathcal S$-term substitution is a partial function $\{x_0, x_1, \dots \} \to T$, where $T$ is the set of $\mathcal S$-terms.
Definition [applying a substitution] Let $\mathcal S$ be a signature and $\sigma$ an $\mathcal S$-term substitution. In the following, it is defined what it means to apply $\sigma$ to an $\mathcal S$-term $t$ and an $\mathcal S$-formula $\varphi$, in symbols $t\sigma$ and $\varphi\sigma$, respectively. The result of the application is an $\mathcal S$-term and an $\mathcal S$-formula, respectively.
Application to terms:
Base mapping.
- $c\sigma = c$ for every $c \in \mathcal S$.
- $x_i\sigma = x_i$ for $x_i \notin \text{dom}(\sigma)$ and $x_i\sigma = \sigma(x_i)$ for $x_i \in \text{dom}(\sigma)$, for every $i \in \mathbf N$.
Inductive step.
- $f(t_0, \dots, t_{n-1})\sigma = f(t_0\sigma, \dots, t_{n-1}\sigma)$ for $f/\!/n \in \mathcal S$ and $\mathcal S$-terms $t_0, \dots, t_{n-1}$.
Application to formulas:
Base mapping.
- $\bot\sigma = \bot$ and $\top\sigma = \top$.
- $R(t_0, \dots, t_{n-1})\sigma = R(t_0\sigma, \dots, t_{n-1}\sigma)$ for $R{/}n \in \mathcal S$ and $\mathcal S$-terms $t_0, \dots, t_{n-1}$.
- $t_0 \doteq t_1\sigma = (t_0 \sigma) \doteq (t_1\sigma)$ for $\mathcal S$-terms $t_0, t_1$.
Inductive step.
- $C(\varphi_0, \dots, \varphi_{n-1})\sigma = C(\varphi_0\sigma, \dots, \varphi_{n-1}\sigma)$ for an $n$-ary connective $C$ and $\mathcal S$-formulas $\varphi_0, \dots, \varphi_{n-1}$.
- $Q x_i \varphi \sigma = Q x_i (\varphi \sigma')$ for $Q \in \{\exists, \forall\}$ where the difference between $\sigma$ and $\sigma'$ is that if $\sigma$ is defined for $x_i$, then $\sigma'$ is undefined for $x_i$.
Substitution lemma
The statement of the substitution lemma presupposes a formal definition of name clash.
Definition [name clash] Let $\varphi$ be a first-order formula and $\sigma$ a term substitution. There is a name clash in $\varphi$ for $\sigma$ if there exists some $x_i \in \text{dom}(\sigma)$ and some $x_j \in \text{vars}(\sigma(x_i))$ such that $x_i \in \text{scope}(x_j, \varphi)$.
This can also be rephrased using the notion of free term, where this is defined as follows.
Definition [free term] Let $\varphi$ be a first-order formula, $t$ a term and $i \in \mathbf N$. The term $t$ is free in $\varphi$ for $x_i$ if $x_i \notin \text{scope}(x_j, \varphi)$ for every $x_j \in \text{vars}(t)$.
So there is a name clash for $\sigma$ in $\varphi$ if for some $x_i \in \text{dom}(\sigma)$ the term $\sigma(x_i)$ is not free in $\varphi$ for $x_i$.
Substitution lemma Let $\mathcal S$ be a signature, $\varphi$ an $\mathcal S$-formula, $\sigma$ an $\mathcal S$-term substitution, $\mathcal A$ an $\mathcal S$-structure, and $\beta$ an $A$ assignment. Then \begin{align} \mathcal A, \beta & \mymodels \varphi\sigma \qquad \text{iff} \qquad \mathcal A, \beta^\sigma & \mymodels \varphi \qquad \text{if there is no name clash for $\sigma$ in $\varphi$.} \end{align} The effect of $\sigma$ on $\beta$, denoted $\beta^\sigma$, is the $A$-assignment defined by $\beta^\sigma(x_i) = \llbracket x_i\sigma\rrbracket_\beta^\mathcal A$.
Proof First of all, in the proof of the lemma a substitution lemma for terms is needed: \begin{align*} \llbracket t\sigma \rrbracket_\beta^\mathcal A & = \llbracket t \rrbracket_{\beta^\sigma}^\mathcal A \end{align*} for all $\mathcal S$-terms $t$.
A proof of the substitution lemma for formulas follows. It is by induction. For atomic formulas, the substitution lemma for terms is essential. For connectives, the proof is straightforward. The only interesting case is the case of quantifiers in the inductive step.
Let $\varphi$ be defined by $\varphi = Q x_i \psi$ for some first-order formula $\psi$ and $Q \in \{\exists, \forall\}$. Assume the lemma holds for $\psi$ instead of $\varphi$ and that there is no name clash for $\sigma$ in $\varphi$. Let $\sigma'$ be the same as $\sigma$ except that $\sigma'$ is not defined at $x_i$. Then there is no name clash for $\sigma'$ in $\psi$. (Formally, a short proof is required here.)
Now observe that for every $a \in A$ we have: \begin{align*} \beta\left[\frac{x_i}{a}\right]^{\sigma'} & = \beta^{\sigma'}\left[\frac{x_i}{a}\right] && \text{because $\sigma'$ is undefined at $x_i$}\\ & = \beta^{\sigma}\left[\frac{x_i}{a}\right] \enspace. \tag{*} && \text{because $\sigma$ and $\sigma'$ agree everywhere except at $x_i$} \end{align*}
We can conclude that, for every $a \in A$, the following is a chain of equivalent statements.
- $\mathcal A, \beta\left[\frac{x_i}{a}\right] \mymodels \psi \sigma'$.
- $\mathcal A, \beta\left[\frac{x_i}{a}\right]^{\sigma'} \mymodels \psi$, because of the inductive assumption and there is no name clash for $\sigma'$ in $\psi$ (see above).
- $\mathcal A, \beta^\sigma\left[\frac{x_i}{a}\right] \mymodels \psi$, because of (*).
From this, one obtains by the semantics of the existential quantifier and the definition of the application of a substitution: \begin{align*} \mathcal A, \beta & \mymodels Q x_i \psi \sigma \qquad\text{iff}\qquad \mathcal A, \beta^\sigma & \mymodels Q x_i \psi \enspace. \end{align*}
This concludes the proof.
Avoiding name clashes
A simple way to avoid name clashes is to rename bound variables.
Alternative approach
If one wants a simpler statement of the substitution lemma, without the side condition on name clashes, one can proceed as follows. One defines the application of a substitution to a formula in such a way that name clashes don't occur. This can be achieved by renaming of bound variables in the process of the application of the substitution.
Exercises
Exercise
- Geben Sie zu jeder der folgenden Substitutionen an, ob sich ein Namenskonflikt bei Anwendung auf die jeweilige Formel ergibt oder nicht. Begründen Sie Ihre Antwort, z.B. dadurch, dass Sie die entsprechenden Werte der Funktion $\text{scope}$ angeben.
- Geben Sie außerdem zu $\sigma_1$ und $\varphi_1$ eine passende Struktur $\mathcal A$ und Belegung $\beta$ als Gegenbeispiel für das Substitutionslemma ohne die geforderte Bedingung an.
- $\sigma_1=\{x_2 \mapsto g(x_1)\}$ und $\varphi_1=R(x_0,x_0) \wedge \forall x_1\, R(x_1, f(x_2))$
- $\sigma_2=\{x_0 \mapsto f(x_2), x_1 \mapsto g(f(x_2), x_0), x_2 \mapsto g(x_2, x_1)\}$ und $\varphi_2=R(x_0) \vee \exists x_1 (R(x_1) \wedge \forall x_1 S(x_1, x_2))$
- $\sigma_3=\{x_0 \mapsto x_1, x_1 \mapsto f(x_0), x_2 \mapsto g(f(x_2), x_1)\}$ und $\varphi_3=R(x_0,f(x_0))\wedge \exists x_0 (\exists x_1 (R(f(x_0),f(x_1)))) \wedge \exists x_0 (x_0\dot{=}x_2)$
- $\sigma_4=\{x_0 \mapsto x_1, x_1 \mapsto x_2, x_2 \mapsto g(f(x_2), x_0)\}$ und $\varphi_4=R(x_0,f(x_0))\wedge \exists x_0 (\exists x_1 (R(f(x_0),f(x_1)))) \wedge \exists x_0 (x_0\dot{=}x_2)$
- Bei der Substitution $\sigma_1=\{x_2 \mapsto g(x_1)\}$ gibt es bei der Formel $\varphi_1=R(x_0,x_0) \wedge \forall x_1\, R(x_1, f(x_2))$ einen Namenskonflikt, denn $x_2\in \operatorname{scope}(x_1,\varphi_1)$, $x_2\in\operatorname{dom}(\sigma_1)$ und $x_1\in\operatorname{vars}(\sigma_1(x_2))=\operatorname{vars}(g(x_1))$.
- Bei der Substitution $\sigma_2=\{x_0 \mapsto f(x_2), x_1 \mapsto g(f(x_2), x_0), x_2 \mapsto g(x_2, x_1)\}$ gibt es bei der Formel $\varphi_2=R(x_0) \vee \exists x_1 (R(x_1) \wedge \forall x_1 S(x_1, x_2))$ einen Namenskonflikt, denn $x_2\in \operatorname{scope}(x_1,\varphi_2)$, $x_2\in\operatorname{dom}(\sigma_2)$ und $x_1\in\operatorname{vars}(\sigma_2(x_2))=\operatorname{vars}(g(x_2, x_1))$.
- Bei der Substitution $\sigma_3=\{x_0 \mapsto x_1, x_1 \mapsto f(x_0), x_2 \mapsto g(f(x_2), x_1)\}$ gibt es bei der Formel $\varphi_3=R(x_0,f(x_0))\wedge \exists x_0\, (\exists x_1 (R(f(x_0),f(x_1)))) \wedge \exists x_0 (x_0\dot{=}x_2)$ keinen Namenskonflikt, denn die einzige Variable, die in einem Scope auftaucht ist $x_2$ mit $x_2\in\operatorname{scope}(x_0,\varphi_3)$ doch $x_0\not\in \operatorname{vars}(\sigma_3(x_2))=\operatorname{vars}(g(f(x_2), x_1))$.
- Bei der Substitution $\sigma_4=\{x_0 \mapsto x_1, x_1 \mapsto x_2, x_2 \mapsto g(f(x_2), x_0)\}$ gibt es bei der Formel $\varphi_4=R(x_0,f(x_0))\wedge \exists x_0\, (\exists x_1 (R(f(x_0),f(x_1)))) \wedge \exists x_0 (x_0\dot{=}x_2)$ einen Namenskonflikt, denn $x_2\in\operatorname{scope}(x_0,\varphi_4)$, $x_2\in\operatorname{dom}(\sigma_4)$ und $x_0\in\operatorname{vars}(\sigma_4(x_2))=\operatorname{vars}(g(f(x_2), x_0))$.
- Wir betrachten folgende Struktur $\mathcal A$ mit Trägermenge $A$ und Belegung $\beta$: \begin{align*} A &= \mathbb{N}\\ f^{\mathcal{A}}(n) &= n\text{ f.a. $n\in A$}\\ g^{\mathcal{A}}(n) &= n+1\text{ f.a. $n\in A$}\\ R^{\mathcal{A}}&=\{\langle n, 1\rangle\mid n\in A\}\\ \beta&=\{x_0\mapsto 1, x_1\mapsto 0, x_2\mapsto 1\} \end{align*} Die Struktur $\mathcal A$ mit Trägermenge $A$ und der folgenden Belegung $\beta^{\sigma_1}=\{x_0\mapsto \beta(x_0), x_1\mapsto \beta(x_1), x_2\mapsto g^{\mathcal A}(\beta(x_1))\}=\{x_0\mapsto 1, x_1\mapsto 0, x_2\mapsto 1\}$ ist ein Modell für $\varphi_1$ aber $\mathcal A,\beta\not\models\varphi_1\sigma_1$.