Logical reasoning is a core method of obtaining knowledge. Whatever its relationship to empirical knowledge and to our broader philosophical commitments, we should at least understand what sort of reasoning logic is, and how it should be integrated into our knowledge seeking methods. We must therefore investigate the foundations of logic, which consist of at least two ingredients: (i) axioms, and (ii) inference rules. Axioms are foundational premises that we deduce conclusions from, and inference rules describe how exactly we may combine premises to get conclusions. Consider the classical syllogism:
All men are mortal
Socrates is a man
Therefore, Socrates is mortal.
In this argument, (1) and (2) are premises, and we arrive at the conclusion (3) via the inference rule of modus ponens. This inference rule states that we may combine “If P then Q” and “P” to conclude “Q”; here, we read (1) as the conditional “if X is a man, then X is mortal” and instantiate it to Socrates to obtain the needed “If P then Q”. But why should we believe the premises (1) and (2)? And why is modus ponens a valid inference rule? We can offer arguments to answer both questions, but those arguments will themselves have some logical structure consisting of premises and inference rules. At some point we will hit rock bottom and require foundational premises, or axioms.
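To make the structure explicit, here is a minimal sketch of the argument in Lean 4. The identifiers (Person, Man, Mortal, socrates) are my own illustrative choices, not anything from the sources discussed here: premise (1) is read as a universally quantified conditional, instantiated at Socrates, and modus ponens then delivers the conclusion.

```lean
-- A minimal sketch of the syllogism; all names here are hypothetical.
example (Person : Type) (Man Mortal : Person → Prop) (socrates : Person)
    (h1 : ∀ p : Person, Man p → Mortal p)  -- (1) All men are mortal
    (h2 : Man socrates) :                  -- (2) Socrates is a man
    Mortal socrates :=                     -- (3) Therefore, Socrates is mortal
  h1 socrates h2  -- instantiate (1) at Socrates, then apply modus ponens to (2)
```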
In this note, I will focus on the issue of axiom selection for deductive systems. Much of the discussion will apply to inference rules as well, but I will focus on axiom selection for concreteness and simplicity. Let’s lay out the core question:
Axiom Selection: When building a system of deductive reasoning, how should we choose the axioms of that system?
By framing the question in this way, I am deliberately ignoring some philosophical issues. For example, I have not stated a paradigm in which to understand axioms, inference rules, and deductions. Should we, for example, really think of logic as beginning from axioms and proceeding towards deductions? Not everyone agrees with this. For example, the SEP compares two paradigms for resolving the infinite regress problem when justifying the claim “X is F”.
“Foundationalism: To halt the regress by taking there to be a foundation—a set of X’s whose F-ness is taken to be basic, and by which we can account for the F-ness of all other X’s.
or:
Coherentism: To resist an infinite regress by allowing a circular or holistic explanation of the F-ness of at least some X’s.” (Emphasis mine).
Another issue is the epistemic meaning of axioms. When we choose axioms, should we be Pragmatists and care about their usefulness, or do we seek axioms that are “actually true” or “actually fundamental”?
I will not resolve either issue in this note, but I will attempt to interweave these questions into a more concrete discussion of axiom selection. My hope is to both understand methods for axiom selection as they occur in practice, and clarify what good axiom selection ought to look like in theory.
The rest of this note is organized as follows. First, I will discuss two methods of axiom selection, called “Second Philosophy” and “Quinean holism.” Next, I will propose three desiderata for axiom selection methods, based on the arguments for both positions. Finally, I will investigate the set-theoretic Axiom of Choice as a case study, and give some concluding thoughts.
Position 1: Second philosophy
In her book Defending the Axioms, P. Maddy presents the “Second Philosopher” as one who prioritizes scientific inquiry first, and philosophy second. The Second Philosopher goes about her inquiries into atoms, molecules, and so forth with whatever methods seem to make sense, and when she does encounter philosophical issues she is interested in them insofar as they have something to say about her methods. The following passage from Maddy illustrates the point.
“van Fraassen holds that familiar experimental evidence establishes the existence of unobservables (like atoms) for scientific purposes, but that, from a philosophical or epistemic standpoint, such beliefs can never be justified. Faced with such a challenge, our inquirer is simply baffled: all her good evidence has been declared irrelevant, ‘merely scientific’; she’s asked to justify her belief in atoms on other grounds for unfamiliar purposes, but she’s given no working understanding of what those purposes are and what methods are appropriate in their service. Philosophy undertaken in such complete isolation from science and common sense is often called ‘First Philosophy’, so I call our inquirer a Second Philosopher.” (Emphasis mine).
From this point of view, science and mathematics do not develop after philosophers have bequeathed the correct metaphysics onto the rest of us; rather, they proceed in concert with, and somewhat independently of, more abstract considerations. Indeed, this seems to match historical experience. We often discover facts about the world without a fully rigorous framework in which to interpret them.
For example, a rigorous understanding of thermodynamics in physics (e.g. by Carnot) came only after the first steam engines were introduced and used. We can imagine interrogating a boiler room operator in the 1820s with questions like “Do you know that the turning of one gear will cause its neighbor to turn?” or “Can you explain how the burning of coal turns water into pressurized steam?” To a Second Philosopher, these questions should not be cause for concern if the operator has been successfully making the train run for years.
Similarly, for the Second Philosopher, axiom selection should proceed along “normal” grounds as much as possible. In non-foundational mathematics (that is, outside of set theory and logic) we in fact routinely select axioms to capture abstract objects of interest (e.g. vector spaces). Selecting these axioms is something of an art form, but usually one looks for axioms that are actually satisfied by many concrete examples, describe an interesting phenomenon, and get at the essence of the thing itself.
How do we know that we are getting to the essence of the thing? One obvious method is to look at the deductive consequences of the axioms, and this is indeed what Gödel himself argues for in a 1964 paper:
“[T]here may exist, besides the usual axioms…axioms of set theory which a more profound understanding of the concepts underlying logic and mathematics would enable us to recognize as implied by these concepts…There might exist axioms so abundant in their verifiable consequences, shedding so much light upon a whole field, and yielding such powerful methods for solving problems...that, no matter whether or not they are intrinsically necessary, they would have to be accepted at least in the same sense as any well-established physical theory.” (Emphasis mine).
But this position still leaves us uneasy. Why should we accept new axioms simply because they are useful for proving some theorems, or “feel” intuitive to us? Common sense is often horribly wrong. We need to go further and ask whether the usual methods of axiom selection – examining the consequences of an axiom, seeing whether it sheds light on a field or yields powerful methods for solving problems, and so on – actually lead to correct conclusions.
Position 2: Quinean holism
Let us now turn to a method closer in spirit to the SEP’s coherentism. In “Two Dogmas of Empiricism,” Quine outlines a holistic picture of knowledge:
“[W]hat I am now urging is that even in taking the statement as unit we have drawn our grid too finely. The unit of empirical significance is the whole of science. The totality of our so-called knowledge or beliefs, from the most casual matters of geography and history to the profoundest laws of atomic physics or even of pure mathematics and logic, is a man-made fabric which impinges on experience only along the edges…Re-evaluation of some statements entails re-evaluation of others, because of their logical interconnections -- the logical laws being in turn simply certain further statements of the system.” (Emphasis mine).
We can view Coherentism as strictly separate from Foundationalism, but I argue that we can just as well consider coherentist reasons for seeking axioms within a Foundationalist program. Whatever the label, we ultimately want to find the right methods for axiom selection.
One recurring issue is that, absent definitive rules to follow, we need to exercise some judgment. Quine emphasizes that in his system, one can always force certain statements to be true or false by fiddling with assumptions or interpretations at other points in the web. Quoting from “Two Dogmas”:
“But the total field is so undetermined by its boundary conditions, experience, that there is much latitude of choice as to what statements to re-evaluate in the light of any single contrary experience…[N]o statement is immune to revision. Revision even of the logical law of the excluded middle has been proposed as a means of simplifying quantum mechanics; and what difference is there in principle between such a shift and the shift whereby Kepler superseded Ptolemy, or Einstein Newton, or Darwin Aristotle?” (Emphasis mine).
Unlike the Second Philosopher, Quine is perfectly willing to use empirical evidence to decide on new axioms. The Second Philosopher instead leans on existing methods, asking “what methods do set theorists tend to use and trust?” These methods are of a more abstract nature and make little reference to physics, simply because set theory today has diverged quite far from any empirical science.
Again though, we seem to be importing new assumptions to decide on axioms. Why should we give empirical knowledge a privileged status in deductive reasoning? We started this discussion by noting that deductive reasoning seems special and distinct from inductive reasoning, but now we have fallen back on induction anyway. In fact, coherentism seems to be making the more radical claim that all claims of knowledge can be called into question by one another. There is no safe harbor in this web, only nodes that are well-enmeshed within clusters of facts, and those further out on the periphery.
Of course, there are many ways to approach holism. Perhaps we can declare a heuristic that deductive reasoning ought to be considered generally more reliable because it does not require empirical evidence (aside from the pesky issue of axiom selection). But to me, it seems that we are still losing one of the best features of deduction, which is its certainty. 2+2=4 is not a probabilistic statement; why demote it to such a lowly status?
Taking stock: Desiderata for Axiom Selection
Clearly, any method of axiom selection must incorporate external information of some kind. As we’ve seen, one can appeal to the logical fruitfulness of axioms (as in Maddy) or their integration into a broader web of knowledge (as in Quine). Neither of these is an entirely satisfactory resolution to the infinite regress problem, but I think both correctly capture the need to put axiom selection on the same epistemic footing as more ordinary knowledge-seeking questions. Let’s lay out some reasonable desiderata based on what we’ve seen so far.
(i) Appeal to external information: Axiom selection should involve the integration of “knowledge” from various sources, even if that knowledge is dependent on assumptions.
(ii) Examination of consequences: We should attempt to thoroughly understand the consequences of accepting or rejecting an axiom before making a decision. In particular, we should consider how it complements or clashes with existing knowledge.
(iii) Preservation of privileged status for deduction: Axiom selection for deductive systems should give privileged status to deductive information, such as mathematical proofs of the mutual consistency or inconsistency of certain sets of axioms.
With these rough heuristics in hand, let’s turn to a case study in the foundations of mathematics.
Case Study: The Axiom of Choice
In the early 20th century, mathematicians introduced several competing formalizations of mathematical foundations. The one that ended up winning out is now called ZF, the Zermelo-Fraenkel axioms for set theory. These axioms formalize concepts like “set” and “infinity”. Essentially all of ordinary mathematics can be derived from them – but there is a catch. In addition to the axioms of ZF, we must also include the more controversial Axiom of Choice (AC), and so end up with the overall set of axioms “ZFC,” for “Zermelo-Fraenkel with Choice.” AC turns out to be logically independent of the other axioms, so this is a genuine choice; we could just as well study “ZF with not-C”, but in practice we choose to go with ZFC. Why?
The axiom itself seems innocuous enough. Bertrand Russell’s informal presentation of AC is as follows:
Axiom of Choice, informal: Given an infinite collection of pairs of shoes, it is clear that we can select the left shoe from each pair to form an infinite collection of left shoes. AC says that we can do the same for an infinite collection of sock pairs, despite a pair of socks having no clear “left” or “right” sock.
More formally, AC can be stated:
Axiom of Choice (AC): Given a set X consisting of nonempty sets x1, x2, …, each containing atoms from a universe U, there is a “choice” function

F: X → U

such that the atom F(x1) belongs to the set x1, the atom F(x2) belongs to the set x2, and so on.
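For readers who prefer to see this written down completely formally, here is one way to render the statement above as a Lean 4 sketch. It is only a sketch under simplifying assumptions I am introducing for illustration: the sets x1, x2, … are indexed by a type I, each set is modeled as a predicate on a universe U of atoms, and the names choiceStatement, U, I, x, and F are all hypothetical.

```lean
-- A sketch of AC as stated above, not the official ZFC formulation.
-- `U` is a hypothetical universe of atoms; each member of the family is a
-- predicate `U → Prop`; `x : I → U → Prop` indexes the sets x1, x2, ….
-- The statement: if every set in the family is nonempty, then some choice
-- function `F` picks an element `F i` lying in the set `x i`, for every index `i`.
def choiceStatement (U I : Type) (x : I → U → Prop) : Prop :=
  (∀ i : I, ∃ a : U, x i a) → ∃ F : I → U, ∀ i : I, x i (F i)
```

A pleasant side effect of working in a proof assistant like Lean is that the role of a choice principle becomes visible: one can check which theorems depend on it and which do not.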
Considering our heuristics for axiom selection, we can examine consequences of AC and of its negation; these are deductive consequences, so they also satisfy our desideratum (iii) by not appealing to empirical arguments. Still, we see a mixed story: AC is essential to proving some results, but also has counterintuitive consequences. Quoting the SEP,
“As the debate concerning the Axiom of Choice rumbled on, it became apparent that the proofs of a number of significant mathematical theorems made essential use of it, thereby leading many mathematicians to treat it as an indispensable tool of their trade…
Although the usefulness of AC quickly become [sic] clear, doubts about its soundness remained. These doubts were reinforced by the fact that it had certain strikingly counterintuitive consequences. The most spectacular of these was Banach and Tarski’s paradoxical decompositions of the sphere (Banach and Tarski 1924): any solid sphere can be split into finitely many pieces which can be reassembled to form two solid spheres of the same size; and any solid sphere can be split into finitely many pieces in such a way as to enable them to be reassembled to form a solid sphere of arbitrary size.” (Emphasis mine).
Examples of the consequences of both AC and its negation abound – this Stack Exchange thread gathers several interesting ones. One striking consequence of AC is that there are certain sets of real numbers that have no consistent notion of “length,” which we call “non-measurable.” For example, the interval [0, 3] has measure 3, and the interval [-2.5, -1.5] has measure 1. If we drop AC, it is consistent (granting some mild additional assumptions) that every set of reals is measurable – this is Solovay’s model. But if we accept AC, we can construct sets that cannot be assigned any coherent “length” or “measure.”
Mathematicians today find the existence of non-measurable sets acceptable, but Kevin Buzzard in the same thread argues that this should bother us:
The fact that there exist non-measurable sets is highly counter-intuitive; the reason we don't find it so is that we've all been conditioned from day 1 to do measure theory very carefully, and define Borel sets, measurable sets, etc, so we all know that non-measurable sets exist because what would be the point of doing it all so carefully otherwise. At high school we were all taught that the probability of an event occurring was "do it a million times, count how often it happened, divide by a million, and now let a million tend to infinity". And no-one thought to ask "what if this process doesn't tend to a limit?". I bet if anyone asked their teacher they'd say "well it always tends to a limit, that's intuitively clear". (Emphasis mine).
What should and shouldn’t we consider intuitive? The details of these debates are surprisingly hand-wavey and full of intuition pumps. But we are not continental philosophers! We prove rigorous theorems regarding the consequences of axioms first, and then we hand-wave and intuition pump.
Figure: The Banach-Tarski paradox shows that, assuming the Axiom of Choice, a sphere can be decomposed into finitely many pieces and reassembled into two spheres of the same size. Image due to Wikipedia.
Regarding non-measurability, let me offer an intuition pump that is hopefully reasonable and not too technical. The Banach-Tarski paradox says you can make two spheres from one. But the decomposition is not physically meaningful. It requires selecting sets of points with impossibly fine control and rearranging them in precisely the way needed to produce two new spheres; realizing this decomposition with physical spheres seems completely impossible. Therefore, despite following from AC, Banach-Tarski is not really telling us anything about the physical spheres we encounter in daily life. It concerns very specific set-theoretic constructions which, while perfectly valid, do not correspond to objects of everyday experience.
To reinforce the point, here is an analogous decomposition of an infinite set. We can “decompose” the positive integers {1, 2, 3, …} into two “copies” of themselves (the evens and the odds) and reassemble them to obtain the full set of integers {0, 1, -1, 2, -2, …}. Simply pair each positive integer with an integer:
1 → 0
2 → 1
3 → -1
4 → 2
And so on. Even though the positive integers are a subset of the overall integers, they turn out to be the same size. Once we see the argument, we learn that our initial intuition was flawed, and accept that infinities can be decomposed and put back together in surprising ways. In the case of Banach-Tarski, the geometric picture may obscure this fact, but the unintuitive infinities are still there.
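For the concretely minded, the pairing above can be written as a single formula: send each even positive integer n to n/2 and each odd positive integer n to -(n-1)/2. Here is a small Lean 4 sketch of this map; the helper name pair is hypothetical, introduced only for this illustration.

```lean
-- A sketch of the explicit pairing: even positive integers go to 1, 2, 3, …
-- and odd positive integers go to 0, -1, -2, ….  The name `pair` is hypothetical.
def pair (n : Nat) : Int :=
  if n % 2 == 0 then Int.ofNat (n / 2) else -Int.ofNat ((n - 1) / 2)

-- The first few values, matching the table above:
#eval (List.range 6).map (fun k => pair (k + 1))  -- expected: [0, 1, -1, 2, -2, 3]
```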
This discussion of paradoxes and the Axiom of Choice illustrates how I feel we should approach concrete axiom selection questions. We can formally deduce consequences of AC, investigate their meaning, and then step back and ask whether they seem reasonable or whether they should ring alarm bells. In doing so, we attempt to carefully understand AC’s role in the broader edifice of mathematical knowledge, and tease out the extent to which it can be separated from other parts of our “web” of knowledge (or tower, if you are a Foundationalist). For example, proving that AC is independent of the other axioms of set theory was a big achievement in this respect, because it allowed us to investigate AC while continuing to study consequences of ZF that do not depend on it.
Of course, these considerations should apply not just to AC but to the entire project of axiom selection in set theory and deductive systems. Today, the mathematical community is weighing new candidate axioms against the body of knowledge we have already built. This debate is ongoing and will take many decades, as we investigate the consequences of new axioms and their relationships to one another. For example, a recent breakthrough paper proved that Martin’s Maximum++ implies Woodin’s axiom (*). Neither axiom is currently part of the standard ZFC list, but results like these matter precisely because they help inform our axiom selection. With enough evidence, our descendants may one day accept Martin’s Maximum with the same guarded confidence that we accept the Axiom of Choice today.
Conclusion: Towards a philosophy of axiom selection
Interesting problems rarely have algorithmic solutions. For a problem as monumental and difficult as axiom selection, I expect the situation to be no different. If you are a starry-eyed Platonist like Gödel (and me), you believe that there is a single set of axioms out there to be discovered. But the road to their discovery is not likely to be straightforward. Indeed, if the true axioms are incompressible information in the sense of Kolmogorov and Chaitin, then even knowing 1000 true axioms will not make it any easier to discover the 1001st.
Logical foundations is an area of philosophy full of confusion, guesswork, and dead ends. We may despair of ever finding firm footing. Despite this enormous difficulty, humanity has been extraordinarily successful in discovering knowledge, both empirical and deductive. Indeed, one reason we should be so confident in the logical methods of mathematics is that the discipline remains unreasonably effective in the natural sciences. Taking this as our encouragement, we should proceed with the full faith and confidence that the correct axioms are out there, waiting to be discovered. I will conclude with some remarks from Gödel.
“[D]espite their remoteness from sense experience, we do have something like a perception also of the objects of set theory, as is seen from the fact that the axioms force themselves upon us as being true. I don't see any reason why we should have less confidence in this kind of perception, i.e., in mathematical intuition, than in sense perception, which induces us to build up physical theories…The set-theoretical paradoxes are hardly any more troublesome for mathematics than deceptions of the senses are for physics. That new mathematical intuitions leading to a decision of such problems as Cantor's continuum hypothesis are perfectly possible was pointed out earlier (pages 264–265).
It should be noted that mathematical intuition need not be conceived of as a faculty giving an immediate knowledge of the objects concerned. Rather it seems that, as in the case of physical experience, we form our ideas also of those objects on the basis of something else which is immediately given. Only this something else here is not, or not primarily, the sensations. That something besides the sensations actually is immediately given follows (independently of mathematics) from the fact that even our ideas referring to physical objects contain constituents qualitatively different from sensations or mere combinations of sensations, e.g., the idea of the object itself…It by no means follows, however, that the data of this second kind, because they cannot be associated with actions of certain things upon our sense organs, are something purely subjective, as Kant asserted. Rather they, too, may represent an aspect of objective reality, but, as opposed to the sensations, their presence in us may be due to another kind of relationship between ourselves and reality.” (Emphasis mine).
Figure: Kurt Gödel (left), with a colleague from the physics department.
what a nerd