Writing for discussion

2025-04-20 18:03:21 -04:00 · 2025-04-20 18:03:21 -04:00 · f9a4db5a53
parent dc84c63523
commit f9a4db5a53
3 changed files with 172 additions and 10 deletions
--- a/cbpv.mng
+++ b/cbpv.mng
@ -11,7 +11,7 @@
 \usepackage[capitalize,nameinlink,noabbrev]{cleveref}
 \usepackage{mathpartir,mathtools,stmaryrd,latexsym}

-\setlist{nosep}
+\setlist{noitemsep}
 \hypersetup{
  colorlinks=true,
  urlcolor=blue,
@ -295,7 +295,7 @@ directly as a relation between a type and a term;
 this presentation is given in \cref{fig:lr-alt}.
 Unfortunately, this is not well defined,
 since the logical relation appears in a negative position
-(to the left of an implication) in the premise of \rref{LRPP-Arr}.
+(to the left of an implication) in the premise of \rref{LR'-Arr}.
 The alternate presentation can be interpreted as a function
 match on the term and the type and returning the conjunction of its premises,
 but I wanted to use a mutually defined inductive definition.
@ -324,7 +324,7 @@ determinism, which states that the interpretation indeed behaves like a function
 from types to sets of terms;
 and backward closure, which states that the interpretations of computation types
 are backward closed under multi-step head reduction.
-The last property is why \rref{LRPP-F-var} needs to include
+The last property is why \rref{LR'-F-var} needs to include
 computations that \emph{reduce to} strongly neutral terms or returns,
 not merely ones that are such terms.

@ -691,7 +691,7 @@ traditional strong normalization of well typed terms holds as a corollary.
  By induction on the derivation of $[[m ∈ SNe]]$.
 \end{proof}

-\begin{theorem}[Soundness (\textsf{SN})] \leavevmode
+\begin{theorem}[Soundness (\textsf{SN})] \label{thm:soundness-sn} \leavevmode
  \begin{enumerate}
    \item If $[[m ∈ SNe]]$ then $[[m ∈ sn]]$.
    \item If $[[v ∈ SN]]$ then $[[v ∈ sn]]$.
@ -716,4 +716,85 @@ traditional strong normalization of well typed terms holds as a corollary.

 \section{Discussion} \label{sec:discussion}

+Overall, the inductive characterization of strongly normal terms is convenient to work with.
+Its congruence and inversion properties are exactly what are needed
+to show \nameref{lem:closure} and \nameref{lem:adequacy} of the logical relation,
+which are already built in by its inductive nature.
+Proving soundness itself relies mostly on the properties of the logical relation,
+along with a few more congruence properties of strong head reduction.
+Knowing the general recipe for constructing the inductive definition
+and for the structure of a proof by logical relations
+makes it easy to adapt the proof of strong normalization to simply typed CBPV.
+There are only a few pitfalls related to simply typed CBPV specifically:
+
+\begin{itemize}[rightmargin=\leftmargin]
+  \item I originally included a corresponding notion of reduction for values
+    in the mutual inductive which reduced under subterms.
+    This is unnecessary not only because values have no $\beta$-reductions,
+    but also because subterms of strongly normal values need to be strongly normal anyway,
+    so there is no need to reduce subterms.
+    Having reduction for values therefore probably characterizes the same set of strongly normal terms,
+    but made proofs unnecessarily difficult to complete.
+  \item Similarly, I originally included premises asserting other subterms of head reduction
+    must be strongly normal even if they appear in the reduced term on the right-hand side.
+    This changes what terms may reduce,
+    but not what is included in the set of strongly normal terms,
+    since \rref{SN-Red} requires that the right-hand term is strongly normal anyway.
+    The soundness proof still went through,
+    but this design is hard to justify by first principles,
+    and adds proof burden to the lemmas for the traditional presentation
+    of strongly normal terms later on.
+  \item The lack of a notion of reduction for values creates an asymmetry
+    that initially led me to omit the fact that the computations
+    in the interpretation of $[[F A]]$ in \rref{LR-F} must \emph{reduce}
+    to strongly neutral terms instead of merely \emph{being} strongly neutral.
+    This makes \nameref{lem:closure} impossible to prove,
+    since strongly neutral terms are not backward closed.
+\end{itemize}
+
+The inductive characterization is easy to work with
+because all the hard work seems to lie in showing it sound
+with respect to the traditional presentation of strongly normal terms.
+This proof took about as much time as the entire rest of the development.
+Not only does it require setting up an entire single-step reduction relation
+and proving all of its tedious congruence, renaming, substitution, and commutativity lemmas,
+the intermediate lemmas for \nameref{thm:soundness-sn} are long and tricky.
+In particular, the congruence, head expansion, and backward closure lemmas
+require double or even triple induction.
+\Rref{sn-App-bwd,sn-Let-bwd} are especially tricky,
+since their double inductions are on two auxiliary derivations
+produced from one of the actual premises,
+and that premise unintuitively needs to stick around over the course of the inductions.
+Without guidance from the POPLMark Reloaded paper,
+I would have been stuck on these two particular lemmas for much longer.
+
+The purpose of all the work done in \cref{sec:sn-acc}
+is to show that $\kw{sn}$ satisfies properties that
+otherwise hold by definition for $\kw{SN}$ as a constructor.
+If proving that the latter is sound with respect to the former
+already requires all this work showing that the former behaves exactly like the latter,
+then what is the point of using the inductive characterization?
+All constructions of $\kw{SN}$ could be directly replaced
+by the congruence and backward closure lemmas we have proven for $\kw{sn}$.
+
+Using the inductive characterization is beneficial only if we don't care
+about its soundness with respect to the traditional presentation,
+where strong normalization is not the end goal.
+For example, in the metatheory of dependent type theory,
+we care about normalization because we want to know
+that the type theory is consistent,
+and that definitional equality is decidable
+so that a type checker can actually be implemented.
+For the latter purpose,
+all we require is \emph{weak normalization}:
+that there exists \emph{some} reduction strategy that reduces terms to normal form.
+The shape of the inductive characterization makes it easy to show
+that a leftmost-outermost reduction strategy does so (\texttt{LeftmostOutermost.lean}).
+While weak normalization can be proven directly using the logical relation,
+generalizing to the inductive strong normalization is actually easier,
+in addition to being more modular and not tying the logical relation
+to the particularities of the chosen reduction strategy.
+
+\subsection{Lean for PL Metatheory}
+
 \end{document}
--- a/cbpv.pdf
+++ b/cbpv.pdf
--- a/cbpv.tex
+++ b/cbpv.tex
@ -11,7 +11,7 @@
 \usepackage[capitalize,nameinlink,noabbrev]{cleveref}
 \usepackage{mathpartir,mathtools,stmaryrd,latexsym}

-\setlist{nosep}
+\setlist{noitemsep}
 \hypersetup{
  colorlinks=true,
  urlcolor=blue,
@ -295,7 +295,7 @@ directly as a relation between a type and a term;
 this presentation is given in \cref{fig:lr-alt}.
 Unfortunately, this is not well defined,
 since the logical relation appears in a negative position
-(to the left of an implication) in the premise of \rref{LRPP-Arr}.
+(to the left of an implication) in the premise of \rref{LR'-Arr}.
 The alternate presentation can be interpreted as a function
 match on the term and the type and returning the conjunction of its premises,
 but I wanted to use a mutually defined inductive definition.
@ -324,7 +324,7 @@ determinism, which states that the interpretation indeed behaves like a function
 from types to sets of terms;
 and backward closure, which states that the interpretations of computation types
 are backward closed under multi-step head reduction.
-The last property is why \rref{LRPP-F-var} needs to include
+The last property is why \rref{LR'-F-var} needs to include
 computations that \emph{reduce to} strongly neutral terms or returns,
 not merely ones that are such terms.

@ -691,7 +691,7 @@ traditional strong normalization of well typed terms holds as a corollary.
  By induction on the derivation of $ \ottnt{m}  \in \kw{SNe} $.
 \end{proof}

-\begin{theorem}[Soundness (\textsf{SN})] \leavevmode
+\begin{theorem}[Soundness (\textsf{SN})] \label{thm:soundness-sn} \leavevmode
  \begin{enumerate}
    \item If $ \ottnt{m}  \in \kw{SNe} $ then $ \ottnt{m}  \in \kw{sn} $.
    \item If $ \ottnt{v}  \in \kw{SN} $ then $ \ottnt{v}  \in \kw{sn} $.
@ -716,4 +716,85 @@ traditional strong normalization of well typed terms holds as a corollary.

 \section{Discussion} \label{sec:discussion}

+Overall, the inductive characterization of strongly normal terms is convenient to work with.
+Its congruence and inversion properties are exactly what are needed
+to show \nameref{lem:closure} and \nameref{lem:adequacy} of the logical relation,
+which are already built in by its inductive nature.
+Proving soundness itself relies mostly on the properties of the logical relation,
+along with a few more congruence properties of strong head reduction.
+Knowing the general recipe for constructing the inductive definition
+and for the structure of a proof by logical relations
+makes it easy to adapt the proof of strong normalization to simply typed CBPV.
+There are only a few pitfalls related to simply typed CBPV specifically:
+
+\begin{itemize}[rightmargin=\leftmargin]
+  \item I originally included a corresponding notion of reduction for values
+    in the mutual inductive which reduced under subterms.
+    This is unnecessary not only because values have no $\beta$-reductions,
+    but also because subterms of strongly normal values need to be strongly normal anyway,
+    so there is no need to reduce subterms.
+    Having reduction for values therefore probably characterizes the same set of strongly normal terms,
+    but made proofs unnecessarily difficult to complete.
+  \item Similarly, I originally included premises asserting other subterms of head reduction
+    must be strongly normal even if they appear in the reduced term on the right-hand side.
+    This changes what terms may reduce,
+    but not what is included in the set of strongly normal terms,
+    since \rref{SN-Red} requires that the right-hand term is strongly normal anyway.
+    The soundness proof still went through,
+    but this design is hard to justify by first principles,
+    and adds proof burden to the lemmas for the traditional presentation
+    of strongly normal terms later on.
+  \item The lack of a notion of reduction for values creates an asymmetry
+    that initially led me to omit the fact that the computations
+    in the interpretation of $ \kw{F} \gap  \ottnt{A} $ in \rref{LR-F} must \emph{reduce}
+    to strongly neutral terms instead of merely \emph{being} strongly neutral.
+    This makes \nameref{lem:closure} impossible to prove,
+    since strongly neutral terms are not backward closed.
+\end{itemize}
+
+The inductive characterization is easy to work with
+because all the hard work seems to lie in showing it sound
+with respect to the traditional presentation of strongly normal terms.
+This proof took about as much time as the entire rest of the development.
+Not only does it require setting up an entire single-step reduction relation
+and proving all of its tedious congruence, renaming, substitution, and commutativity lemmas,
+the intermediate lemmas for \nameref{thm:soundness-sn} are long and tricky.
+In particular, the congruence, head expansion, and backward closure lemmas
+require double or even triple induction.
+\Rref{sn-App-bwd,sn-Let-bwd} are especially tricky,
+since their double inductions are on two auxiliary derivations
+produced from one of the actual premises,
+and that premise unintuitively needs to stick around over the course of the inductions.
+Without guidance from the POPLMark Reloaded paper,
+I would have been stuck on these two particular lemmas for much longer.
+
+The purpose of all the work done in \cref{sec:sn-acc}
+is to show that $\kw{sn}$ satisfies properties that
+otherwise hold by definition for $\kw{SN}$ as a constructor.
+If proving that the latter is sound with respect to the former
+already requires all this work showing that the former behaves exactly like the latter,
+then what is the point of using the inductive characterization?
+All constructions of $\kw{SN}$ could be directly replaced
+by the congruence and backward closure lemmas we have proven for $\kw{sn}$.
+
+Using the inductive characterization is beneficial only if we don't care
+about its soundness with respect to the traditional presentation,
+where strong normalization is not the end goal.
+For example, in the metatheory of dependent type theory,
+we care about normalization because we want to know
+that the type theory is consistent,
+and that definitional equality is decidable
+so that a type checker can actually be implemented.
+For the latter purpose,
+all we require is \emph{weak normalization}:
+that there exists \emph{some} reduction strategy that reduces terms to normal form.
+The shape of the inductive characterization makes it easy to show
+that a leftmost-outermost reduction strategy does so (\texttt{LeftmostOutermost.lean}).
+While weak normalization can be proven directly using the logical relation,
+generalizing to the inductive strong normalization is actually easier,
+in addition to being more modular and not tying the logical relation
+to the particularities of the chosen reduction strategy.
+
+\subsection{Lean for PL Metatheory}
+
 \end{document}