
Set Theory


Set theory is a mathematical theory of collections, "sets," and collecting, as governed by axioms. Part of its larger significance is that mathematics can be reduced to set theory, with sets doing the work of mathematical objects and their collections and set-theoretic axioms providing the basis for mathematical proofs. With this reduction in play, modern set theory has become an autonomous and sophisticated research field of mathematics, enormously successful at the continuing development of its historical heritage as well as at analyzing strong propositions and gauging their consistency strength.

Set theory arose in mathematics in the late nineteenth century as a theory of infinite collections and soon became intertwined with the development of analytic philosophy and mathematical logic. The subject was then developed as the logical distinction was being clarified between "falling under a concept," to be transmuted in set theory to "x ∈ y", x is a member of y, and subordination or inclusion, to be transmuted in set theory to "x ⊆ y", x is a subset of y. That set theory both is a field of mathematics and serves as a foundation for mathematics emerged early in this development.

In what follows, set theory is presented as both a historical and an epistemological phenomenon, driven forward by mathematical problems, arguments, and procedures. The first part describes the groundbreaking work of Georg Cantor on infinite sets analyzed in terms of power, transfinite numbers, and well-orderings. The next two parts describe the subsequent transmutation of the notion of set through axiomatization, a process to be associated largely with Ernst Zermelo. Next will come a description of the work of Kurt Gödel on the constructible sets, work that made first-order logic central to set theory, followed by a description of the work of Paul Cohen on forcing, a method that transformed set theory into a modern, sophisticated field of mathematics. The last section describes the modern investigation of relative consistency in terms of forcing, large cardinals, and inner models.

Power, Number, and Well-Ordering

Set theory was born on that day in December 1873 when Cantor established that the continuum is not countable. The concepts here are fundamental: Taking infinite collections as unitary totalities, a set is countable if it is in one-to-one correspondence with the set of natural numbers {0, 1, 2, …}, and the continuum is the linear continuum regarded extensionally as a collection of points corresponding to the real numbers. In an 1878 publication Cantor investigated ways of defining one-to-one correspondences between sets. For sets of real numbers and the like, he stipulated that two sets have the same power if there is a one-to-one correspondence between them and that a set x has a higher power than a set y if y has the same power as a subset of x yet x and y do not have the same power. He managed to show that the continuum, the plane, and generally n-dimensional Euclidean space all have the same power, but at this point in mathematics there were still only the two infinite powers as set out by his 1873 proof. Cantor at the end of his 1878 publication conjectured:

Every infinite set of real numbers either is countable or has the power of the continuum.

This was the Continuum Hypothesis (CH) in its nascent context. The Continuum Problem would be to resolve this hypothesis, and viewed as a primordial problem it would stimulate Cantor both to approach the real numbers in an increasingly arithmetical fashion and to grapple with fundamental questions of set existence.

In his magisterial Grundlagen of 1883, Cantor developed the transfinite numbers and the key concept of well-ordering. Investing the "symbols of infinity" of his early trigonometric series investigations with a new autonomy, Cantor conceived of the transfinite numbers as being generated by the operations of taking successors and of taking limits of increasing sequences. Extending beyond the finite 0, 1, 2, …, the progression of transfinite numbers could be depicted, in his later notation, in terms of natural extensions of arithmetical operations:
0, 1, 2, …, ω, ω + 1, ω + 2, …, ω + ω (= ω·2), …, ω·3, …, ω·ω (= ω²), …, ω³, …, ω^ω, …


A binary relation ≺ is a linear ordering of a set a if it is transitive, that is, x ≺ y and y ≺ z implies x ≺ z, and trichotomous, that is, for x, y ∈ a, exactly one of x ≺ y, x = y, or y ≺ x holds.

A relation ≺ is a well-ordering of a set a if it is a linear ordering of the set such that every non-empty subset has a ≺-least element.
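For finite sets the two definitions can be checked by brute force. The following is a minimal sketch, not part of Cantor's development: the function names are illustrative, and the relation is assumed to be supplied as a two-argument Python predicate `less`.

```python
from itertools import combinations

def is_linear_ordering(a, less):
    """Check that `less` is transitive and trichotomous on the finite set a."""
    elems = list(a)
    transitive = all(
        less(x, z)
        for x in elems for y in elems for z in elems
        if less(x, y) and less(y, z)
    )
    # Exactly one of x < y, x = y, y < x must hold for every pair.
    trichotomous = all(
        [less(x, y), x == y, less(y, x)].count(True) == 1
        for x in elems for y in elems
    )
    return transitive and trichotomous

def nonempty_subsets(a):
    """Yield every non-empty subset of the finite set a."""
    elems = list(a)
    for r in range(1, len(elems) + 1):
        for combo in combinations(elems, r):
            yield set(combo)

def has_least_element(s, less):
    """True if some member of s precedes or equals every member of s."""
    return any(all(m == y or less(m, y) for y in s) for m in s)

def is_well_ordering(a, less):
    """A linear ordering in which every non-empty subset has a least element."""
    return is_linear_ordering(a, less) and all(
        has_least_element(s, less) for s in nonempty_subsets(a)
    )
```

On a finite set the second condition adds nothing once the first holds, which is the point of Cantor's remark that the general sense of well-ordering is only brought out by infinite sets.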

Well-orderings convey the sense of sequential counting, and the transfinite numbers serve as standards for gauging well-orderings. As Cantor pointed out, every linear ordering of a finite set is already a well-ordering and all such orderings are isomorphic, so that the general sense is only brought out by infinite sets. For these there could be non-isomorphic well-orderings. For example the set of natural numbers {0, 1, 2, …}, that is, the predecessors of ω, can be put into one-to-one correspondence with the predecessors of ω + ω by sequentially counting the evens before the odds. In fact all the infinite transfinite numbers in the above display are countable. Cantor called the set of natural numbers the first number class (I) and the set of numbers whose predecessors are in one-to-one correspondence with (I) the second number class (II). Cantor conceived of (II) as bounded above according to a limitation principle and showed that (II) itself is not countable. Proceeding upward, Cantor called the set of numbers whose predecessors are in one-to-one correspondence with (II) the third number class (III), and so forth. In this way Cantor conceived of ever higher powers as represented by number classes and moreover took every power to be so represented. With this "free creation" of numbers, Cantor then propounded in section 3 of the Grundlagen a basic principle that was to drive the analysis of sets:

It is always possible to bring any well-defined set into the form of a well-ordered set.

He regarded this as "an especially remarkable law of thought which through its general validity is fundamental and rich in consequences." Sets are to be well-ordered and thus to be gauged via the transfinite numbers of his structured conception of the infinite.

The transfinite numbers provided the framework for Cantor's two approaches to the Continuum Problem, one through power and the other through definable sets of real numbers, these each to initiate two vast research programs. As for the first, Cantor in the Grundlagen established results that reduced the Continuum Problem to showing that the continuum and the second number class have the same power. However, despite several announcements Cantor could never develop a workable correlation, an emerging problem being that he could not define a well-ordering of the real numbers. As for the approach through definable sets of real numbers, Cantor showed that "CH holds for closed sets." Closed sets are a very simple kind of definable set of real numbers, and Cantor showed that a closed set either is countable or has the power of the continuum. He thus reduced the Continuum Problem to determining whether there is a closed set of real numbers of the power of the second number class. He could not do this, but he had established the first result of descriptive set theory, the definability theory for the continuum.

Almost two decades after his initial 1873 proof, Cantor in a short 1891 note gave his now celebrated diagonal argument. He proceeded in terms of functions, ushering in collections of arbitrary functions into mathematics, but we state and prove his result as is done nowadays in terms of the power set 𝒫(x) = {y | y ⊆ x} of a set x, the collection of all its subsets: For any set x, 𝒫(x) has a higher power than x.

First, the function associating each a ∈ x with {a}, that subset of x with sole member a, is a one-to-one correspondence between x and a subset of 𝒫(x). Assume now to the contrary that there is a one-to-one correspondence F established between the members of x and all the members of 𝒫(x). Consider the "diagonal" set d = {a | a ∈ x and a ∉ F(a)} consisting of those members a of x that do not belong to their corresponding subset F(a). If d itself were a value of F, say d = F(b) for some b ∈ x, then we would have the paradigmatic contradiction: b ∈ d exactly when b ∉ d. Hence, F was not a one-to-one correspondence after all!
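The diagonal construction can be checked exhaustively on a small finite set. The sketch below is an illustration only: it assumes a function F is represented as a Python dict from members of x to frozensets, and the helper names are not from the source. For every such F, the diagonal set d is missing from the values of F.

```python
from itertools import combinations, product

def power_set(x):
    """List all subsets of the finite set x as frozensets."""
    elems = list(x)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def diagonal_set(x, F):
    """d = {a in x : a not in F(a)} for a map F from x to subsets of x."""
    return frozenset(a for a in x if a not in F[a])

def all_maps(x):
    """Yield every function from x into its power set, as a dict."""
    elems = sorted(x)
    subsets = power_set(x)
    for values in product(subsets, repeat=len(elems)):
        yield dict(zip(elems, values))

x = {0, 1, 2}
# No map from x to its power set has the diagonal set among its values.
diagonal_always_missed = all(
    diagonal_set(x, F) not in F.values() for F in all_maps(x)
)
```

The check covers all 8³ = 512 maps for a three-element set. The argument in the text, of course, does not depend on x being finite.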

Cantor had been shifting his notion of set to a level of abstraction beyond sets of real numbers and the like, and the casualness of his 1891 note may reflect an underlying cohesion with his earlier 1873 argument. Indeed the diagonal argument can be drawn out of the earlier argument, and the new result generalized the old since, with N the set of natural numbers, 𝒫(N ) is in one-to-one correspondence with the continuum. With his new result Cantor affirmed that the powers of well-defined sets have no maximum, and he had proved for the first time that there is a power greater than that of the continuum. However, with his view that every well-defined set is well-ordered Cantor would now have had to confront, in his arbitrary function context, a general difficulty starkly abstracted from the Continuum Problem: From a well-ordering of a set a well-ordering of its power set is not necessarily definable. The diagonal proof called into question Cantor's very notion of set.

Cantor's Beiträge, published in two parts in 1895 and 1897, presented his mature theory of the transfinite. In the first part Cantor reconstrued power as cardinal number, an autonomous concept beyond being une façon de parler about one-to-one correspondence. He defined the addition, multiplication, and exponentiation of cardinal numbers primordially in terms of set-theoretic operations and functions. As befits the introduction of new numbers Cantor then introduced a new notation, one using the Hebrew letter aleph, ℵ. With ℵ₀ the cardinal number of the set of natural numbers Cantor showed that
ℵ₀·ℵ₀ = ℵ₀ and 2^ℵ₀ is the cardinal number of the continuum
(and hence of 𝒫(N)). With this he observed that the 1878 labor of associating the continuum with the plane and so forth could be reduced to a "few strokes of the pen" in his new arithmetic. Cantor only mentioned
ℵ₀, ℵ₁, ℵ₂, …, ℵ_α, …,
these to be the cardinal numbers of the successive number classes from the Grundlagen and thus to exhaust all the infinite cardinal numbers.
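That the product of the cardinal number of the natural numbers with itself is again that cardinal number amounts to a one-to-one correspondence between pairs of natural numbers and natural numbers. One standard such correspondence counts the pairs along successive diagonals. The sketch below uses that pairing as an illustration; the function names are hypothetical, and it is not offered as a transcription of the argument in the Beiträge.

```python
from math import isqrt

def cantor_pair(m, n):
    """Code the pair (m, n) of natural numbers as a single natural number."""
    # Pairs with m + n = w occupy the block starting at w(w+1)/2.
    return (m + n) * (m + n + 1) // 2 + n

def cantor_unpair(k):
    """Recover the pair (m, n) coded by k."""
    w = (isqrt(8 * k + 1) - 1) // 2   # index of the diagonal containing k
    n = k - w * (w + 1) // 2          # offset within that diagonal
    return (w - n, n)
```

Every natural number is the code of exactly one pair, so the set of pairs has the same power as the set of natural numbers.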

Cantor then developed his theory of order types, "types" or abstractions of linear orderings. He defined the addition and multiplication of order types and characterized the order types of the rational numbers and of the real numbers. In the second Beiträge Cantor turned to the special case of well-orderings and reconstrued the transfinite numbers as their order types, newly calling the numbers the ordinal numbers. He then established their basic comparability properties by showing that given two well-orderings one is isomorphic to an initial segment of the other or vice versa. In this new setting he concentrated on the countable ordinal numbers, the new construal of the second number class, and provided an incisive structural analysis in terms of a new operation of ordinal exponentiation.

The two parts of the Beiträge were not only distinct by subject matter, cardinal number and the continuum vs. ordinal number and well-ordering, but also between them there developed a wide, insurmountable breach. In the first part nowhere is the 1891 result stated even in a special case, though it was now possible to express it as m < 2^m for any cardinal number m, since in his arithmetic 2^m is the cardinal number of the power set of a set with cardinal number m.

Also, the second Beiträge does not mention any aleph beyond ℵ₁, nor does it mention the Continuum Hypothesis, which could have been stated as 2^ℵ₀ = ℵ₁. Every well-ordered set, through a corresponding ordinal number, has an aleph as its cardinal number, but how does 2^ℵ₀ fit into the aleph sequence?

Thus the Continuum Problem was embedded in the very interstices of the early development of set theory, and in fact the structures that Cantor built, while now of great intrinsic interest, emerged out of efforts to articulate and solve the Continuum Problem. The tension uncovered by Cantor's diagonal argument between well-ordering and power set (or arbitrary functions) would soon be revisited by Zermelo. David Hilbert, when he presented his famous list of twenty-three problems at the 1900 International Congress of Mathematicians at Paris, made the Continuum Problem the very first problem and intimated Cantor's difficulty by suggesting the desirability of "actually giving" a well-ordering of the real numbers.

At the turn into the twentieth century the "logical" limits of set formation and existence were broached for sets being counterparts to "concepts" or properties. In correspondence with Hilbert and Richard Dedekind in the late 1890s Cantor became newly engaged with questions of set existence. He had earlier considered collections like all ordinal numbers or all alephs as leading out of his conceptual framework. These "absolutely infinite or inconsistent multiplicities," if admitted as sets, would lead to contradictions, and Cantor argued anew that every set can be well-ordered, else it would be in one-to-one correspondence with all the ordinal numbers and hence an inconsistent multiplicity. In this he anticipated later developments in set theory.

Bertrand Russell, a main architect of the analytic tradition in philosophy, focused in 1900 on Cantor's work. Russell was pivoting from idealism toward logicism, the thesis that mathematics can be founded in logic. Taking a universalist approach to logic with all-encompassing categories, Russell took the class of all classes to have the largest cardinal number but saw that Cantor's 1891 result leading to higher cardinal numbers presented a problem. Analyzing that argument, by the spring of 1901 he arrived at the famous Russell's Paradox. This paradox showed with remarkable simplicity that there are properties P(x) such that the collection of objects having that property, the class
{x | P(x)},
cannot itself be an object: Consider {x | x ∉ x}. If this were an object r in the range of possibilities, then we would have the contradiction r ∈ r exactly when r ∉ r. This paradox may have been critical for Russell's universalist approach to logic and for logicism, but it was less so for the development of set theory, which was emerging in mathematics. In any case the paradox did serve as a motivation for fashioning a consistent notion of set through axiomatization.

The first decade of the new century saw Zermelo make his major advances in the development of set theory. Already estimable as an applied mathematician, Zermelo turned to set theory and its foundations under the influence of Hilbert. Zermelo's first substantial result was his independent discovery of the argument for Russell's Paradox. He then established in 1904 the Well-Ordering Theorem, that every set can be well-ordered, assuming what he soon called the Axiom of Choice (AC). Zermelo thereby shifted the notion of set away from the implicit assumption of Cantor's principle that every well-defined set is well-ordered and replaced that principle by an explicit axiom about a wider notion of set.

In retrospect Zermelo's argument for his Well-Ordering Theorem can be viewed as pivotal for the development of set theory. To summarize the argument, suppose that x is a set to be well-ordered, and through Zermelo's Axiom-of-Choice hypothesis assume that the power set 𝒫(x) = {y | y ⊆ x} has a choice function, that is, a function γ such that for every non-empty member y of 𝒫(x), γ(y) ∈ y. Call a subset y of x a γ-set if there is a well-ordering R of y such that for each a ∈ y,
γ({z | z ∉ y or z R a fails}) = a.
That is, each member of y is what γ "chooses" from what does not already precede that member according to R. The main observation is that γ-sets cohere in the following sense: If y is a γ-set with well-ordering R and z is a γ-set with well-ordering S, then y ⊆ z and S is a prolongation of R, or vice versa. With this, let w be the union of all the γ-sets, that is, all the γ-sets put together. Then w too is a γ-set, and by its maximality it must be all of x and hence x is well-ordered.
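For a finite set the γ-set condition can be made concrete. The sketch below is a finite illustration under stated assumptions: the choice function `gamma` is any Python function returning a member of a non-empty frozenset, a well-ordered subset is represented as a list, and the loop replaces Zermelo's union of all γ-sets, which is what the argument needs in the infinite case.

```python
def well_order_by_choice(x, gamma):
    """List the members of the finite set x in the order induced by gamma."""
    order = []
    remaining = set(x)
    while remaining:
        a = gamma(frozenset(remaining))  # gamma's choice from what is left
        order.append(a)
        remaining.remove(a)
    return order

def is_gamma_set(order, x, gamma):
    """Check that each listed member is gamma's choice from the members
    of x that do not precede it in the list."""
    x = set(x)
    return all(
        gamma(frozenset(x - set(order[:i]))) == a
        for i, a in enumerate(order)
    )

# Hypothetical choice function: always choose the largest member.
x = {1, 2, 3}
full_order = well_order_by_choice(x, max)
```

With `max` as the choice function the γ-sets of {1, 2, 3} are exactly the initial segments of the listing 3, 2, 1, which shows the coherence property on a small scale: any two γ-sets are comparable, one prolonging the other.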

Note that the converse to this result is immediate in that if x is well-ordered, say with a well-ordering ≺, then the power set 𝒫(x) has a choice function δ, namely for each non-empty member y of 𝒫(x), let δ(y) be the ≺-least member of y. Not only did Zermelo's argument analyze the connection between well-ordering and choice functions, but it anticipated in its defining of approximations and taking of a union the proof procedure for von Neumann's Transfinite Recursion Theorem.

Zermelo maintained that the Axiom of Choice, to the effect that every set has a choice function, is a "logical principle" which "is applied without hesitation everywhere in mathematical deduction," and this is reflected in the Well-Ordering Theorem being regarded as a theorem. Cantor's work had served to exacerbate a growing discord among mathematicians with respect to two related issues: whether infinite collections can be mathematically investigated at all, and how far the function concept is to be extended. The positive use of an arbitrary function operating on arbitrary subsets of a set having been made explicit, there was open controversy after the appearance of Zermelo's proof. This can be viewed as a turning point for mathematics, with the subsequent tilting toward the acceptance of the Axiom of Choice symptomatic of a conceptual shift in mathematics.


In response to his critics Zermelo published a second proof of the Well-Ordering Theorem in 1908, and with axiomatization assuming a general methodological role in mathematics he also published in 1908 the first full-fledged axiomatization of set theory. But as with Cantor's work, this was no idle structure building but a response to pressure for a new mathematical context. In this case it was not for the formulation and solution of a problem like the Continuum Problem, but rather to clarify a proof. Zermelo's motive in large part for axiomatizing set theory was to buttress his Well-Ordering Theorem by making explicit its underlying set existence assumptions. Effecting the first transmutation of the notion of set after Cantor, Zermelo ushered in a new abstract, prescriptive view of sets as solely structured by membership and governed by axioms, a view that would soon come to dominate.

The following are Zermelo's axioms, much as they would be presented today. They are to govern the connections between ∈ and = and to prescribe the generation of new sets out of old. The standard axiomatization would be the result of adding two further axioms and formalizing in first-order logic.

axiom of extensionality

Two sets are equal exactly when they have the same members. Thus sets epitomize the extensional view of mathematics, it being stipulated that however sets are arrived at, there is a definite criterion for equality provided solely by membership.

axiom of empty set

There is a set having no members. This axiom serves to emphasize the beginning with an initial set, the empty set, denoted ∅.

axiom of pairs

For any sets x and y, there is a set consisting of exactly x and y as members. The posited set is denoted {x, y} and is called the (unordered) pair of x and y. {x, x} is denoted {x}, as we have already seen, and is called the singleton of x.

axiom of union

For any set x, there is a set consisting exactly of those sets that are members of some member of x. The posited set is denoted ⋃x and is called the union of x. This "generalized" union subsumes the better known binary union, in that for any sets a and b,
a ∪ b = ⋃{a, b} = {x | x ∈ a or x ∈ b}.
If a set x is structured as an indexed set {x_i | i ∈ I}, then ⋃x is often written as ⋃_{i∈I} x_i or just ⋃_i x_i.

axiom of power set

For any set x, there is a set consisting exactly of the subsets of x. The posited set is denoted 𝒫(x ) and is called the power set of x, as we have already seen.

axiom of choice

For any set x consisting of non-empty, pairwise disjoint sets, there is a set c such that every member of x has exactly one element in c. Thus, c acts like a choice function for x construed as a family of sets. This is a reductive way of positing choice functions.

axiom of infinity

There is a set having ∅ as a member and such that whenever y is a member, so also is y ∪ {y}. This has become the usual way of positing the existence of an infinite set, in light of the definition of ordinals. Zermelo actually stated his axiom with "y ∪ {y}" replaced by "{y}," getting at a set describable informally as {∅, {∅}, {{∅}}, …}.

axiom of separation

For any set x and definite property P, there is a set consisting exactly of those members of x having the property P. Once a collection has been comprehended as a set, we are able to form a subset by "separating" out according to a property. Or, a subclass of a set is a set. Taking the property of being a member of a given set a, we have as a set the binary intersection
x ∩ a = {y | y ∈ x and y ∈ a}.
Taking the property of not being a member of a, we have as a set the set-theoretic difference
x − a = {y | y ∈ x and y ∉ a}.
As a further use of the axiom, consider for a set x the intersection of x:
⋂x = {a | a ∈ y for every y ∈ x}.
This is a (property-specifiable) subclass of any member of x, and so we have as a theorem: If x ≠ ∅ then ⋂x is a set. This is a "generalized" intersection, with the better known binary intersection being ⋂{x, a} = x ∩ a.
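Restricted to finite sets, several of these axioms correspond to simple operations. The sketch below models sets as Python frozensets and "definite properties" as Python predicates; both are simplifying assumptions made for illustration, and the function names are not Zermelo's.

```python
from itertools import combinations

EMPTY = frozenset()  # the empty set

def unordered_pair(x, y):
    """Pairs: the set whose members are exactly x and y."""
    return frozenset({x, y})

def union(x):
    """Union: the members of the members of x."""
    return frozenset(m for y in x for m in y)

def power_set(x):
    """Power Set: the set of all subsets of x."""
    elems = list(x)
    return frozenset(frozenset(c) for r in range(len(elems) + 1)
                     for c in combinations(elems, r))

def separation(x, prop):
    """Separation: the members of x that satisfy prop."""
    return frozenset(y for y in x if prop(y))

def intersection(x):
    """Generalized intersection, separated out of one member of x."""
    if not x:
        raise ValueError("the intersection of the empty set is not a set")
    some_member = next(iter(x))
    return separation(some_member, lambda a: all(a in y for y in x))

a = frozenset({1, 2})
b = frozenset({2, 3})
```

Binary union, intersection, and difference then come out as in the text: `union(unordered_pair(a, b))`, `intersection(unordered_pair(a, b))`, and `separation(a, lambda y: y not in b)`.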

According to Zermelo a property is "definite if the fundamental relations of the domain, by means of the axioms and the universally valid laws of logic, determine without arbitrariness whether it holds or not." But with no underlying logic formalized, the ambiguity of definite property would become a major issue, one that would eventually be resolved only decades later through first-order formalization. In any case Zermelo saw that the Separation idea suffices for a development of set theory that still allows for the "logical" formation of sets according to property. Russell's Paradox is forestalled since only "logical" subsets are to be allowed; indeed, Zermelo's first theorem was that there is no universal set, a set that contains every set as a member, the reductio argument being the paradox argument.

Stepping back, Extensionality, Empty Set, and Pairs served to lay the basis for sets. Infinity and Power Set ensured sufficiently rich settings for set-theoretic constructions. Tempering the logicians' extravagant and problematic "all," Power Set provided the provenance for "all" for subsets of a given set, just as Separation served to capture "all" for elements of a given set satisfying a property. Finally, Union and Choice completed the encasing of Zermelo's proof(s) of his Well-Ordering Theorem in the necessary set existence principles.

Although Hilbert's axiomatization of geometry in his 1899 Grundlagen der Geometrie may have served as a model for Zermelo's axiomatization of set theory and Dedekind's 1888 essay Was sind und was sollen die Zahlen? on the foundations of arithmetic a precursor, there are crucial differences having to do with subject matter and proof. Both in intent and outcome Dedekind and Hilbert had been engaged in the analysis of fixed subject matter. Dedekind in particular had done a great deal to enshrine proof as the vehicle for algebraic abstraction and generalization. Like algebraic constructs, sets were new to mathematics and would be incorporated by setting down rules for their proofs. Just as Euclid's axioms for geometry had set out the permissible geometric constructions, the axioms of set theory would set out rules for set generation and manipulation. But unlike the emergence of mathematics from marketplace arithmetic and Greek geometry, sets and transfinite numbers were neither laden with nor bolstered by substantial antecedents. There was no fixed, intended subject matter. Like strangers in a strange land stalwarts developed a familiarity with sets guided step by step by the axiomatic framework. For Dedekind it had sufficed to work with sets by merely giving a few definitions and properties, those foreshadowing Extensionality, Union, and Infinity. Zermelo provided more rules: Separation, Power Set, and Choice.

Zermelo's 1908 axiomatization paper, especially with its rendition at the end of the Cantorian theory of cardinality in terms of functions cast as set constructs, brought out Zermelo's set-theoretic reductionism. Zermelo pioneered the reduction of mathematical concepts and arguments to set-theoretic concepts and arguments from axioms, based on sets doing the work of mathematical objects. Set theory would provide the underpinnings of mathematics, and Zermelo's axioms would resonate with emerging mathematical practice. Zermelo's analysis moreover served to draw what would come to be generally regarded as set-theoretic out of the realm of the presumptively logical. This would be particularly salient for Infinity and Power Set and was strategically advanced by the segregation of property considerations to Separation. Based on generative and prescriptive axioms, set theory would become more combinatorial, less logical. With these features Zermelo's axioms indeed proved more than adequate to serve as a reductive basis for mathematics, at least for providing surrogates for mathematical objects; looking ahead it was for subsequent developments to bring out that set theory could also serve as a court of adjudication in terms of relative consistency.

Felix Hausdorff was the first developer of the transfinite after Cantor, the one whose work first suggested the rich possibilities for a mathematical investigation of the higher transfinite. A mathematician par excellence, Hausdorff took the sort of mathematical approach to set theory and set-theoretic approach to mathematics which would come to dominate in the years to come. In a 1908 publication Hausdorff brought together his extensive work on uncountable order types, and in particular formulated the Generalized Continuum Hypothesis (GCH): For any infinite set x, there is no set of cardinal number strictly intervening between that of x and of its power set 𝒫(x); or in Cantor's later terms, for every ordinal number α, 2^{ℵ_α} = ℵ_{α+1}. Hausdorff also entertained for the first time a "large cardinal" concept, of which more below. Hausdorff's classic 1914 text, Grundzüge der Mengenlehre, broke the ground for a generation of mathematicians in both set theory and topology. He presented Cantor's and Zermelo's work systematically, and of particular interest, he applied the Axiom of Choice to provide what is now known as Hausdorff's Paradox. The source of the later and better known Banach-Tarski Paradox, Hausdorff's Paradox provided an implausible decomposition of the sphere and was the first, and a dramatic, synthesis of classical mathematics and the new Zermelian abstract view.

In the Grundzüge Hausdorff defined an ordered pair of sets in terms of (unordered) pairs, formulated functions in terms of ordered pairs, and ordering relations as collections of ordered pairs. Hausdorff thus capped efforts of logicians by making their moves in mathematics, completing the set-theoretic reduction of relations and functions. In the modern setting, the definition of the ordered pair that has been adopted is not Hausdorff's, but one provided by Kazimierz Kuratowski in 1921:
⟨x, y⟩ = {{x}, {x, y}}.

This satisfies all that is operationally required of an ordered pair:
⟨x, y⟩ = ⟨a, b⟩ exactly when x = a and y = b.
With this definition, a set r is a relation if it consists of ordered pairs. This objectification is often eased by reverting to the older conceptual notation a r b for ⟨a, b⟩ ∈ r. A set f is a function if it is a relation satisfying: If ⟨x, y⟩ ∈ f and ⟨x, z⟩ ∈ f, then y = z. This objectification is eased by reverting to the older operational notation f(x) = y for ⟨x, y⟩ ∈ f, though the emphasis is on the generality and arbitrariness of f as just a relation with a univalency property. Finally the dynamic notation f : a → b specifies that f is a function such that every member of a is a first coordinate of an ordered pair in f, and that every second coordinate is a member of b.
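The Kuratowski definition and the univalency condition can be tried out on small examples. This is a sketch only, with frozensets standing in for sets and integers for the coordinates; the names `kpair`, `first`, `second`, and `is_function` are illustrative choices.

```python
def kpair(x, y):
    """Kuratowski ordered pair {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def first(p):
    """First coordinate: the one object belonging to every member of p."""
    return next(iter(frozenset.intersection(*p)))

def second(p):
    """Second coordinate: the other object in the union of p, if any."""
    x = first(p)
    rest = frozenset.union(*p) - {x}
    return next(iter(rest)) if rest else x

def is_function(r):
    """A set of ordered pairs is a function if no first coordinate
    is paired with two different second coordinates."""
    seen = {}
    for p in r:
        x, y = first(p), second(p)
        if seen.setdefault(x, y) != y:
            return False
    return True
```

Note that `kpair(x, x)` collapses to {{x}}, which is why `second` falls back to the first coordinate when nothing else is left.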

Axiomatization Completed

In the 1920s fresh initiatives structured the loose Zermelian framework with new features and corresponding developments in axiomatics, the most consequential moves made by John von Neumann with anticipations by Dimitry Mirimanoff in a pre-axiomatic setting. Von Neumann effected a Counter-Reformation of sorts that led to the incorporation of a new axiom, the Axiom of Replacement: The transfinite numbers had been central for Cantor but peripheral to Zermelo; von Neumann reconstrued them as bona fide sets, the ordinals, and established their efficacy by formalizing transfinite recursion, the method of sequential definition of sets based on previously defined sets applied with transfinite indexing.

Ordinals manifest the basic idea of taking precedence in a well-ordering simply to be membership:


A set x is transitive if ⋃x ⊆ x, that is, whenever a ∈ b and b ∈ x, then a ∈ x.

A set x is a (von Neumann) ordinal if x is transitive and the membership relation restricted to x = {y | y ∈ x} is a well-ordering of x.

For example, ∅ is transitive, but {{∅}} is not. Loosely speaking, transitive sets retain all their hereditary members. The first several ordinals are
∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, …
and are newly taken to be the numbers 0, 1, 2, 3, …. If x is an ordinal, then so also is x ∪ {x}, the successor of x, and this accounts for how the Axiom of Infinity was formulated in the previous section. It has become customary to use the Greek letters α, β, γ, … to denote ordinals. Von Neumann, as had Mirimanoff before him, established the key instrumental property of Cantor's ordinal numbers for ordinals: Every well-ordered set is order-isomorphic to exactly one ordinal with membership. The proof made a paradigmatic use of Replacement, and so was the first proof to draw that axiom into set theory.
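The finite ordinals can be built and tested directly. The sketch below represents sets as frozensets and checks the ordinal condition only for hereditarily finite sets, where a linear ordering is automatically a well-ordering; the function names are illustrative.

```python
def successor(x):
    """The successor of x, namely x together with x itself as a new member."""
    return x | frozenset({x})

def ordinal(n):
    """The von Neumann ordinal standing for the natural number n."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

def is_transitive(x):
    """Every member of a member of x is a member of x."""
    return all(a in x for b in x for a in b)

def is_ordinal(x):
    """Transitive, and linearly ordered by membership (finite case only)."""
    members = list(x)
    trichotomous = all(
        [a in b, a == b, b in a].count(True) == 1
        for a in members for b in members
    )
    membership_transitive = all(
        a in c
        for a in members for b in members for c in members
        if a in b and b in c
    )
    return is_transitive(x) and trichotomous and membership_transitive
```

Each `ordinal(n)` has exactly n members, namely the ordinals standing for the smaller numbers, so precedence among them is literally membership.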

For a set x and property P(v, w), the property is said to be functional on x if for any a ∈ x, there is exactly one b such that P(a, b).

axiom of replacement

For any set x and property P(v, w) functional on x, {b | P(a, b) for some a ∈ x} is a set.

This axiom allows for new sets that result when members of a set are "replaced" according to a property. If the functional property is given by a set, that is, if there is a function f, a set of ordered pairs, such that P(v, w) exactly when f(v) = w, then Replacement is not needed. However, as in the case of the above-stated result correlating arbitrary well-orderings with ordinals, there are functional properties that are more general, typically formulated by recursion.

Replacement subsumes Separation. Suppose that x is a set and P is a (definite) property. If there are no members of x satisfying P, then we are done. Otherwise, fix such a member y₀. For any a ∈ x, let P(a, a) hold if a satisfies P and P(a, y₀) hold otherwise. Then the "replaced" set {b | P(a, b) for some a ∈ x} is the set of members of x satisfying P.
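The bookkeeping in this argument can be followed on a finite example. In the sketch below the functional property is given as an ordinary Python function, which is an assumption made for illustration (so it only shows how the derivation runs, not the strength of the axiom), and the function names are hypothetical.

```python
def replacement(x, P):
    """The set of values P(a) for a in x, with P given as a function."""
    return frozenset(P(a) for a in x)

def separation_via_replacement(x, prop):
    """Derive the set of members of x satisfying prop using replacement."""
    witnesses = [a for a in x if prop(a)]
    if not witnesses:
        return frozenset()  # no member satisfies prop: the empty set
    y0 = witnesses[0]       # a fixed member satisfying prop
    # Members satisfying prop are sent to themselves, all others to y0.
    return replacement(x, lambda a: a if prop(a) else y0)

evens = separation_via_replacement(frozenset(range(10)), lambda n: n % 2 == 0)
```

Every member failing the property is sent to the fixed witness, which already satisfies it, so nothing unwanted enters the replaced set.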

Von Neumann took the crucial step of ascribing to the ordinals the role of Cantor's ordinal numbers with their several principles of generation. Now, with ordinal numbers regarded as gauging well-orderings, that one is isomorphic to a proper initial part of another corresponds for ordinals to actual membership and can be rendered
α < β exactly when α β.
For this reconstrual of ordinal numbers and already to define the arithmetic of ordinals von Neumann saw the need to establish the Transfinite Recursion Theorem, the theorem that validates definitions by recursion along well-orderings. The proof was anticipated by the Zermelo 1904 proof, but Replacement was necessary even for the very formulation, let alone the proof, of the theorem. With the ordinals in place von Neumann completed the restoration of the Cantorian transfinite by defining the cardinals as the initial ordinals, those ordinals not in one-to-one correspondence with any of its predecessors. The infinite initial ordinals are denoted
$\omega = \omega_0, \omega_1, \omega_2, \ldots, \omega_\alpha, \ldots,$
so that ω is to be the set of natural numbers in the ordinal construal, and the identification of different intensions is signaled by
$\omega_\alpha = \aleph_\alpha$
with the left being a von Neumann ordinal and the right being the Cantorian cardinal number. Every infinite set x, with AC, is well-orderable and hence in one-to-one correspondence with an initial ordinal $\omega_\alpha$, and the cardinality of x is $|x| = \aleph_\alpha$. It has become customary to use the middle Greek letters κ, λ, μ, … to denote initial ordinals in their role as the cardinals. A successor cardinal is one of the form $\aleph_{\alpha+1}$ and is denoted $\kappa^+$ for $\kappa = \aleph_\alpha$. A cardinal which is not a successor cardinal is a limit cardinal.
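For the finite ordinals, the rendering of order as membership can be checked directly. The sketch below assumes a representation of sets as nested Python frozensets; the helper names `successor`, `ordinal`, and `is_transitive` are introduced here for illustration only.

```python
def successor(a: frozenset) -> frozenset:
    """The von Neumann successor: a together with a itself as a new member."""
    return a | frozenset({a})


def ordinal(n: int) -> frozenset:
    """The finite von Neumann ordinal n, built from the empty set by n successors."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a


def is_transitive(a: frozenset) -> bool:
    """A set is transitive if every member is also a subset."""
    return all(b <= a for b in a)


# Order among finite ordinals coincides with membership.
order_is_membership = all(
    (m < n) == (ordinal(m) in ordinal(n))
    for m in range(6)
    for n in range(6)
)
```

Each finite ordinal n built this way has exactly n members, namely its predecessors, which is the finite shadow of the identification of an ordinal with the set of smaller ordinals.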

Replacement has been latterly regarded as somehow less necessary or crucial than the other axioms, the purported effect of the axiom being only on large-cardinality sets. Initially Abraham Fraenkel and Thoralf Skolem had independently in 1922 proposed the addition of Replacement to Zermelo's axioms, both pointing out the inadequacy of Zermelo's axioms for establishing that $E = \{Z_0, \mathcal{P}(Z_0), \mathcal{P}(\mathcal{P}(Z_0)), \ldots\}$ is a set, where $Z_0 = \{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \ldots\}$ is Zermelo's infinite set from his Axiom of Infinity. However even $F = \{\emptyset, \mathcal{P}(\emptyset), \mathcal{P}(\mathcal{P}(\emptyset)), \ldots\}$ cannot be proved to be a set from Zermelo's axioms: The union of E above, with membership restricted to it, models Zermelo's axioms yet does not have F as a member. Hence Zermelo's axioms cannot establish the existence of some simple countable sets consisting of finite sets and could be viewed as remarkably lacking in closure under finite recursive processes. If the Axiom of Infinity were itself modified to entail that F is a set, then there would still be many other finite sets a so that $\{a, \mathcal{P}(a), \mathcal{P}(\mathcal{P}(a)), \ldots\}$ cannot be proved to be a set. Replacement serves to rectify the situation by allowing new infinite sets defined by "replacing" members of the one infinite set given by the Axiom of Infinity. In any case the full exercise of Replacement is part and parcel of transfinite recursion, and it was von Neumann's formal incorporation of this method into set theory, as necessitated by his proofs, that brought in Replacement.

Von Neumann (and before him Mirimanoff, Fraenkel, and Skolem) also considered the salutary effects of restricting the universe of sets to the well-founded sets. The well-founded sets are the sets that belong to some "rank" $V_\alpha$, these definable through transfinite recursion:
$V_0 = \emptyset$; $V_{\alpha+1} = \mathcal{P}(V_\alpha)$; and $V_\delta = \bigcup\{V_\alpha \mid \alpha < \delta\}$ for limit ordinals δ.
$V_\omega$ consists of the "hereditarily finite" sets, $\omega \in V_{\omega+1}$, and $\mathcal{P}(\omega) \in V_{\omega+2}$, and so already in these beginning levels there are set counterparts for many objects in mathematics. That the universe V of all sets is the cumulative hierarchy
$V = \bigcup\{V_\alpha \mid \alpha \text{ is an ordinal}\}$
is thus the assertion that every set is well-founded. Von Neumann essentially showed that this assertion is equivalent to a simple assertion about sets:

axiom of foundation

$\forall x\,(x \neq \emptyset \rightarrow \exists y \in x\,(x \cap y = \emptyset))$.

Thus non-empty well-founded sets have $\in$-minimal members. If a set x satisfies $x \in x$ then {x} is not well-founded; similarly if there are $x_1 \in x_2 \in x_1$, then $\{x_1, x_2\}$ is not well-founded. Ordinals and sets consisting of ordinals are well-founded, and well-foundedness can be viewed as a generalization of being an ordinal that loosens the connection with transitivity. The Axiom of Foundation eliminates pathologies like $x \in x$ and through the cumulative hierarchy rendition provides metaphors about building up the universe of sets and the possibility of inductive arguments to establish results about all sets.
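The first finite levels of the cumulative hierarchy are small enough to compute outright. The following sketch assumes hereditarily finite sets represented as nested Python frozensets; `powerset`, `V`, `rank`, and `epsilon_minimal` are illustrative names, not notation from the text. It also exhibits an ∈-minimal member of any non-empty set by taking a member of least rank.

```python
from itertools import combinations


def powerset(s: frozenset) -> frozenset:
    """All subsets of a finite set, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )


def V(n: int) -> frozenset:
    """The finite level V_n: start from the empty set and take power sets n times."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level


def rank(x: frozenset) -> int:
    """Least n with x a subset of V_n; equivalently x is a member of V_(rank+1)."""
    return max((rank(y) + 1 for y in x), default=0)


def epsilon_minimal(x: frozenset) -> frozenset:
    """A member y of non-empty x with x and y disjoint, found by minimizing rank."""
    return min(x, key=rank)


sizes = [len(V(n)) for n in range(5)]  # sizes of V_0, ..., V_4
```

A member of least rank works because any common member of x and y would have rank strictly below that of y, contradicting minimality. The same idea, carried out along all the ordinals, is what ties Foundation to the cumulative hierarchy.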

In a remarkable 1930 publication Zermelo offered his final axiomatization of set theory as well as a striking, synthetic view of a procession of models that would have a modern resonance. Proceeding in what we would now call a second-order context, Zermelo extended his 1908 axiomatization by adjoining both Replacement and Foundation. The standard axiomatization of set theory
ZFC, Zermelo-Fraenkel with Choice,
is recognizable, the main difference being that ZFC is a first-order theory (see the next section); "Fraenkel" acknowledges Fraenkel's suggestion of adjoining Replacement; and the Axiom of Choice is explicitly mentioned.
ZF, Zermelo-Fraenkel,
is ZFC without AC and is a base theory for the investigation of weak Choice-type propositions as well as propositions that contradict AC.

Zermelo herewith completed his transmutation of the notion of set, his abstract, prescriptive view stabilized by further axioms that structured the universe of sets. Replacement and Foundation focused the notion of set, with the first providing the means for transfinite recursion and induction, and the second making possible the application of those means to get results about all sets. It is nowadays almost banal that Foundation is the one axiom unnecessary for the recasting of mathematics in set-theoretic terms, but the axiom is also the salient feature that distinguishes investigations specific to set theory as an autonomous field of mathematics. Indeed it can be fairly said that modern set theory is at base a study couched in well-foundedness, the Cantorian well-ordering doctrines adapted to the Zermelian generative and prescriptive conception of sets. With Replacement and Foundation in place, Zermelo was able to provide natural models of his axioms and to establish algebraic isomorphism, initial segment, and embedding results for his models. Finally Zermelo posited an endless procession of his models, each a set in the next, as natural extensions of their cumulative hierarchies.

Zermelo found a simple set-theoretic condition, being an inaccessible cardinal, that characterizes the ordinal heights of his models, that is those ordinals ρ such that the predecessors of ρ are exactly the ordinals of a model.


An infinite cardinal κ is singular if there is an $x \subseteq \kappa$ of smaller cardinality than κ which is cofinal in κ, that is to say for any α < κ there is a $\beta \in x$ with $\alpha \le \beta$. An infinite cardinal which is not singular is regular.

An infinite cardinal κ is a strong limit if for any cardinal β < κ, $2^\beta < \kappa$.

An infinite cardinal κ is inaccessible if it is both regular and a strong limit.

$\aleph_0$ is regular; $\aleph_1$, $\aleph_2$, and generally, all successor cardinals are regular. The limit cardinal $\aleph_\omega$ is singular, since it has a countable cofinal subset $\{\aleph_0, \aleph_1, \aleph_2, \ldots\}$. Hausdorff in 1908 had initially entertained the possibility of having a regular limit cardinal. Inaccessible cardinals had later been considered to be a stronger version that arithmetically incorporated power sets, but Zermelo provided the first structural rationale for them, as the delimiters of his natural models.

Inaccessible cardinals are the modest beginnings of the theory of large cardinals, a mainstream of modern set theory devoted to the investigation of strong hypotheses and consistency strength. Large cardinal hypotheses posit structure in the higher reaches of the cumulative hierarchy, most often by positing cardinals that prescribe their own inaccessible transcendence over smaller cardinals, and were seen by the 1970s to form a natural hierarchy of stronger and stronger propositions transcending ZFC.

The journal volume containing Zermelo's 1930 publication also contained Stanisław Ulam's seminal paper on measurable cardinals, which became the most pivotal of all large cardinals. For a set s, U is a (non-principal) ultrafilter over s if U is a collection of subsets of s containing no singletons; if $x \in U$ and $x \subseteq y \subseteq s$, then $y \in U$; if $x \in U$ and $y \in U$, then $x \cap y \in U$; and for any $x \subseteq s$, either $x \in U$ or $s - x \in U$. For a cardinal λ, an ultrafilter U is λ-complete if for any $D \subseteq U$ of cardinality less than λ, $\bigcap D \in U$. Finally an uncountable cardinal κ is measurable if there is a κ-complete ultrafilter over κ. Thus, a measurable cardinal is a cardinal whose power set is structured with a two-valued "measure" having a strong closure property. Measurability embodied the first large cardinal confluence of Cantor's two legacies, the investigation of definable sets of reals and the extension of number into the transfinite: The concept was distilled from measure-theoretic considerations related to Lebesgue's measure for sets of real numbers, and it also entailed inaccessibility in the transfinite.
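The closure clauses of the ultrafilter definition can be checked by brute force over a small finite set. The sketch below rests on assumptions introduced here: the function names are hypothetical, and the "no singletons" clause is replaced by the weaker requirement that the empty set is excluded, so that principal ultrafilters can be observed. On a finite set every ultrafilter turns out to be principal, which is why the non-principal ultrafilters of the text live only over infinite sets.

```python
from itertools import combinations


def subsets(s: frozenset) -> list:
    """All subsets of a finite set, each as a frozenset."""
    items = sorted(s)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def is_ultrafilter(U: frozenset, s: frozenset) -> bool:
    """Check properness, upward closure, closure under intersection, and the ultra clause."""
    subs = subsets(s)
    if frozenset() in U:
        return False  # assumption made here: the empty set is excluded
    for x in U:
        for y in subs:
            if x <= y and y not in U:
                return False  # not closed upward
    for x in U:
        for y in U:
            if (x & y) not in U:
                return False  # not closed under intersection
    # For every subset x, either x or its complement must belong to U.
    return all(x in U or (s - x) in U for x in subs)


def principal_at(a, s: frozenset) -> frozenset:
    """The principal ultrafilter of all subsets of s containing the point a."""
    return frozenset(x for x in subsets(s) if a in x)


def contains_singleton(U: frozenset) -> bool:
    return any(len(x) == 1 for x in U)


s = frozenset({0, 1, 2})
subs = subsets(s)
# Enumerate all 256 families of subsets of s and keep the ultrafilters.
families = [
    frozenset(f)
    for r in range(len(subs) + 1)
    for f in combinations(subs, r)
]
ultrafilters = [U for U in families if is_ultrafilter(U, s)]
```

Every family that survives the check contains a singleton, so none satisfies the text's non-principality clause.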

Formalization and Model-Theoretic Methods

Zermelo's 1930 publication was in part a response to Skolem's 1922 advocacy of the idea of framing Zermelo's 1908 axioms in first-order logic. First-order logic investigates the logic of formal languages consisting of formulas built up from specified function and predicate symbols using logical connectives and first-order quantifiers $\forall$ and $\exists$, these interpreted as ranging over the elements of a domain of discourse. (Second-order logic has quantifiers interpreted as ranging over properties, or collections of elements.) First-order logic had emerged in the 1917 lectures of Hilbert as a delimited system of logic potentially amenable to mathematical analysis. Entering from a different, algebraic tradition Skolem had established a seminal result for "metamathematical" methods with the Löwenheim-Skolem Theorem: If a countable collection of first-order sentences has a model then it has a countable model.

For set theory Skolem proposed formalizing Zermelo's axioms in the first-order language with $\in$ and = as binary predicate symbols. Zermelo's definite properties were to be those expressible in this first-order language in terms of given sets, and the Axiom of Separation was to become a schema of axioms, one for each first-order formula that has variables allowing for set parameters. As a palliative for taking set theory as a foundation for mathematics, Skolem then pointed out what has come to be called the Skolem Paradox: Zermelo's 1908 axioms cast in first-order logic form a countable collection of sentences, and so if they have a model at all, they have a countable model. (Analogous remarks apply to the latterly adjoined Axiom of Replacement becoming a schema.) Thus we have the paradoxical existence of countable models for Zermelo's axioms although they entail the existence of uncountable sets. Zermelo found this antithetical and repugnant. However stronger currents were at work, leading to a further, subtler transmutation of the notion of set mediated by first-order logic and incorporating its relativism of set-theoretic concepts.

Gödel virtually completed the mathematization of logic by submerging metamathematical methods into mathematics. The main vehicle was the direct coding, "the arithmetization of syntax," in his celebrated 1931 Incompleteness Theorem, which worked dialectically against a program of Hilbert's for establishing the consistency of mathematics. But starting an undercurrent, the earlier 1930 Completeness Theorem for first-order logic clarified the distinction between the formal syntax and semantics (interpretations) of first-order logic, and secured its key instrumental property with the Compactness Theorem: If a collection of first-order sentences is such that every finite subcollection has a model, then the whole collection has a model.

Gödel's work showed that the notion of the consistency of a mathematical theory has a formal counterpart expressible in the first-order language with function symbols for addition and multiplication. Loosely speaking, a theory is a collection of sentences of some first-order language; that a sequence of formulas constitutes a deduction can be formalized; and a theory is consistent if from it no contradiction can be derived. Gödel's arithmetization of syntax codes all this into statements about the natural numbers and their arithmetic, yielding a formula
Con(T )
asserting the formal consistency of T, at least for those theories whose sentences can be schematically defined. Gödel famously established through his Incompleteness Theorem that for consistent theories subsuming the arithmetic of the natural numbers, Con(T ) itself cannot be deduced from T. However, one may be able to deduce relative notions:


A sentence σ is relatively consistent with a theory T if Con(T ) implies Con(T + σ).

A sentence σ is independent of a theory T if both σ and its negation are relatively consistent with T.

Two sentences $\sigma_1$ and $\sigma_2$ are equi-consistent over a theory T if Con(T + $\sigma_1$) is equivalent to Con(T + $\sigma_2$).

These assertions would be established over a weak base theory. For example, in the parlance, that a set-theoretic statement σ is relatively consistent with set theory generally means that Con(ZFC) implies Con(ZFC + σ), this itself deducible in (some weak version of) ZFC. Consistency strength in set theory can be discussed in these terms, typically for strong set-theoretic statements not provable from ZFC: For two set-theoretic statements $\sigma_1$ and $\sigma_2$, the consistency strength of $\sigma_1$ is at least that of $\sigma_2$ if Con(ZFC + $\sigma_1$) implies Con(ZFC + $\sigma_2$), and so $\sigma_1$ and $\sigma_2$ have equal consistency strength if $\sigma_1$ and $\sigma_2$ are equi-consistent over ZFC.

Tarski in the early 1930s completed the mathematization of logic by providing his "definition of truth," exercising philosophers to a surprising extent ever since. Tarski simply schematized truth as a correspondence between formulas of a formal language and set-theoretic assertions about an interpretation of the language and provided a recursive definition of the satisfaction relation, when a formula holds in an interpretation, in set-theoretic terms. This response to a growing need for a mathematical framework became the basis for model theory. The eventual effect of Tarski's mathematical formulation of semantics would be not only to make mathematics out of the informal notion of satisfiability, but also to enrich ongoing mathematics with a systematic method for forming mathematical analogues of several intuitive semantic notions. For coming purposes, the following specifies notation and concepts in connection with Tarski's definition:


For a first-order language, an interpretation N of that language (i.e., a specification of a domain of discourse as well as interpretations of the function and predicate symbols), a formula $\varphi(v_1, v_2, \ldots, v_n)$ of the language with the variables as displayed, and $a_1, a_2, \ldots, a_n$ in the domain of N,
$N \models \varphi[a_1, a_2, \ldots, a_n]$
asserts that the formula φ is satisfied in N according to Tarski's recursive definition when each $v_i$ is interpreted as $a_i$.

A subset y of the domain of N is first-order definable over N if there is a formula $\psi(v_0, v_1, v_2, \ldots, v_n)$ and $a_1, a_2, \ldots, a_n$ in the domain of N such that
$y = \{z \mid N \models \psi[z, a_1, \ldots, a_n]\}$.

Set theory was launched on an independent course as a distinctive field of mathematics by Gödel's formulation of the model L of "constructible" sets, with which he established the relative consistency of the Axiom of Choice (AC) and the Generalized Continuum Hypothesis (GCH). L is a transitive class containing all the ordinals that, with the membership relation restricted to it, satisfies each axiom of ZFC as well as GCH. Through L Gödel established that Con(ZF) implies Con(ZFC + GCH) and thus attended to fundamental issues at the beginnings of set theory. In his first, 1938 announcement Gödel described L as a hierarchy "which can be obtained by Russell's ramified hierarchy of types, if extended to include transfinite orders." Indeed with L Gödel had refined the cumulative hierarchy of sets to a cumulative hierarchy of definable sets which is analogous to the orders of Russell's ramified theory. Gödel's further innovation was to continue the indexing of the hierarchy through all the (von Neumann) ordinals to get a model of set theory. In a 1939 note Gödel presented L essentially as it is presented today: For any set x let def(x ) denote the collection of subsets of x first-order definable over x according to the previous definition. Then define:
$L_0 = \emptyset$; $L_{\alpha+1} = \mathrm{def}(L_\alpha)$; $L_\delta = \bigcup\{L_\alpha \mid \alpha < \delta\}$ for limit ordinals δ;
and the constructible universe
$L = \bigcup\{L_\alpha \mid \alpha \text{ is an ordinal}\}$.
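As a worked illustration, offered as a sketch and not as part of Gödel's presentation, the first levels of the constructible hierarchy can be compared with the ranks of the cumulative hierarchy. Every subset of a finite set is first-order definable from parameters, so the two hierarchies agree through all finite levels and at ω; they first part ways at the next level, since there are only countably many formulas and finite parameter sequences over a countable set.

```latex
\begin{align*}
L_0 &= \emptyset = V_0, \\
L_1 &= \mathrm{def}(\emptyset) = \{\emptyset\} = V_1, \\
L_2 &= \mathrm{def}(\{\emptyset\}) = \{\emptyset, \{\emptyset\}\} = V_2, \\
L_n &= V_n \quad \text{for every finite } n, \qquad L_\omega = V_\omega, \\
|L_{\omega+1}| &= \aleph_0 \;<\; 2^{\aleph_0} = |V_{\omega+1}|.
\end{align*}
```

The last line shows in miniature how restricting to definable subsets tames the power set operation.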

Gödel brought into set theory a method of construction and argument and thereby affirmed several features of its axiomatic presentation. First Gödel showed that def(x) and generally first-order definability over set domains is itself definable in set theory, so that in particular the definition of L can be effected in set theory via transfinite recursion. This significantly contributed to a lasting ascendancy for first-order logic which beyond its sufficiency as a logical framework for mathematics was seen to have considerable operational efficacy. Gödel's construction moreover buttressed the incorporation of Replacement and Foundation into set theory. Replacement was immanent in the arbitrary extent of the ordinals for the indexing of L and in its formal definition via transfinite recursion. As for Foundation, underlying the construction was the well-foundedness of sets, and significantly, Gödel viewed L as deriving its contextual sense from the cumulative hierarchy of sets regarded as an extension of the simple theory of types. In footnote 12 of his 1939 note he wrote, "In order to give A [that V = L ] an intuitive meaning, one has to understand by 'sets' all objects obtained by building up the simplified hierarchy of types on an empty set of individuals (including types of arbitrary transfinite orders)." Some have been puzzled about how the cumulative hierarchy picture emerged in set-theoretic practice; although there was Mirimanoff, von Neumann, and especially Zermelo, the picture came in with Gödel's method, the reasons being both thematic and historical: Gödel's work with L with its incisive analysis of first-order definability was readily recognized as a signal advance, while Zermelo (1930) with its second-order vagaries remained somewhat obscure. As the construction of L was gradually digested, the sense that it promoted of a cumulative hierarchy reverberated to become the basic picture of the universe of sets.

In a notable inversion, what has come to be regarded as the iterative conception, the conception of sets as being built up through stages of construction as schematized by the cumulative hierarchy, has become a heuristic for motivating the axioms of set theory generally. This has opened the door to a metaphysical appropriation in the following sense: It is as if there is some notion of set that is "there," in terms of which the axioms must find some further justification. But set theory has no particular obligations to mirror some prior notion of set, especially one like the iterative conception, arrived at a posteriori. Replacement and Choice for example do not quite "fit" the iterative conception, but if need be, Replacement can be "justified" in terms of achieving algebraic closure of the axioms, a strong motivation in the work of Fraenkel and the later Zermelo, and Choice can be "justified" as a logical principle as Zermelo had maintained.

Gödel's proof of the GCH in L, like Zermelo's proof of the Well-Ordering Theorem, was synthetic and pivotal for the development of set theory. Gödel actually established that if λ is an infinite cardinal and $x \in L_\lambda$, then for any $y \subseteq x$ in L, $y \in L_\lambda$. The Power Set Axiom was thus tamed in L leading to the relative consistency of GCH. Replacement played a crucial role not only by providing for the prior extent of ordinals, but also in allowing this first instance of model-theoretic reflection. Reflection properties, which in one form came to be seen as equivalent to Replacement, assert that various properties holding at one level of the cumulative hierarchy hold at an earlier level, and they have been a leading heuristic for motivating large cardinals. Gödel's proof also made a specific, positive use of the Skolem Paradox argument, as he used what are now known as Skolem functions to take a Skolem hull. Paradox became method, affirming the operational efficacy of first-order logic. Finally Gödel took for the first time what is now known as the transitive collapse. Andrzej Mostowski would later state in general terms the result, which is a generalization to well-founded relations and transitive sets of the Mirimanoff-von Neumann result, that every well-ordered set is order-isomorphic to exactly one ordinal with membership. While that result was basic to the analysis of well-orderings, the transitive collapse result grew in significance from specific applications and came to epitomize how well-foundedness made possible a coherent theory of models of set theory.

In all these ways Gödel's work promoted a further transmutation of, or at least a new relativism about, the notion of set as mediated by first-order logic. By the 1950s ZFC was generally taken to be a theory formalized in first-order logic. The relativism of set-theoretic concepts was brought to the fore, as well as new possibilities for constructions of models of set theory. Results even about definable sets of real numbers would turn on contingencies of relative consistency. Notably, Gödel himself held a "Platonistic" conception of set theory as descriptive of an objective universe schematized by the cumulative hierarchy; nonetheless, his work laid the groundwork for the development of a range of models and axioms for set theory.

Gödel's work with L stood as an isolated monument for quite a number of years, World War II no doubt having a negative effect on mathematical progress. On the crest of a new generation Dana Scott established a result in 1961 that would become seminal for the theory of large cardinals. Ultrafilters gained prominence in model theory in the late 1950s because of the emergence of the ultrapower and more generally ultraproduct construction for building concrete models, when Scott made the crucial move of taking the ultrapower of the universe V itself by an ultrafilter as provided by a measurable cardinal. Such an ultrafilter provided well-founded ultrapowers, and the full exercise of the transitive collapse now led to an inner model M and an elementary embedding $j : V \to M$.


M is an inner model if it is a transitive class containing all the ordinals that, with the membership relation restricted to it, satisfies each axiom of ZF.

A class function $j : V \to M$ from the universe V of sets into an inner model M is an elementary embedding if for any set-theoretic formula $\varphi(v_1, v_2, \ldots, v_n)$ and sets $a_1, a_2, \ldots, a_n$,
$V \models \varphi[a_1, a_2, \ldots, a_n]$ exactly when $M \models \varphi[j(a_1), j(a_2), \ldots, j(a_n)]$.
(This suggests the general notion of elementary embedding in model theory; the notion cannot be formalized for V in ZFC, but sufficient schematic approximations can. Below, elementary embeddings are assumed not to be the identity function.) L is the paradigmatic inner model. Appealing to its definability Scott established: If there is a measurable cardinal, then $V \neq L$. Large cardinal hypotheses thus assumed a new significance through a new proof construction, as a means for maximizing possibilities away from Gödel's delimitative universe. The ultrapower construction provided one direction and H. Jerome Keisler soon provided the other of a new characterization that established a central structural role for measurable cardinals: There is an elementary embedding $j : V \to M$ for some inner model M exactly when there is a measurable cardinal. Through model-theoretic methods set theory was brought to the point of entertaining elementary embeddings into well-founded models, soon to be transfigured by a new method for getting well-founded extensions of well-founded models.


In 1963 Paul Cohen established the independence of the Axiom of Choice from ZF and the independence of the Continuum Hypothesis from ZFC. That is, complementing Gödel's relative consistency results with L Cohen established that Con(ZF) implies Con(ZF + the negation of AC) and that Con(ZFC) implies Con(ZFC + the negation of CH). These results delimited ZF and ZFC in terms of the two fundamental issues at the beginnings of set theory. But beyond that, Cohen's proofs were soon to flow into method, becoming the inaugural examples of forcing, a remarkably general and flexible method for extending models of set theory. Forcing has strong intuitive underpinnings and reinforces the notion of set as given by the first-order ZF axioms with conspicuous uses of Replacement and Foundation. If Gödel's construction of L had launched set theory as a distinctive field of mathematics, then Cohen's method of forcing began its transformation into a modern, sophisticated one. Cohen's particular achievement lies in devising a concrete procedure for extending well-founded models of set theory in a minimal fashion to well-founded models of set theory with new properties but without altering the ordinals. Set theory had undergone a sea-change, and beyond simply how the subject was enriched, it is difficult to convey the strangeness of it.

Cohen's approach was to start with a model M of ZF and adjoin a set G, one that would exhibit some desired new property. He realized that this had to be done in a minimal fashion in order that the resulting structure also model ZF, and so imposed restrictive conditions on both M and G. He took M to be a countable standard model, that is a countable transitive set that together with the membership relation restricted to it is a model of ZF. (The existence of such a model is an avoidable assumption in formal relative consistency proofs via forcing.) The ordinals of M would then coincide with the predecessors of some ordinal ρ, and M would be the cumulative hierarchy $M = \bigcup_{\alpha<\rho} (V_\alpha \cap M)$.

Cohen then established a system of terms to denote members of the new model, finding it convenient to use a ramified language: For each $x \in M$ let $\dot{x}$ be a corresponding constant; let $\dot{G}$ be a new constant; and for each α < ρ introduce quantifiers $\forall_\alpha$ and $\exists_\alpha$. Then develop a hierarchy of terms as follows: $\dot{M}_0 = \{\dot{G}\}$, and for limit ordinals δ < ρ, $\dot{M}_\delta = \bigcup_{\alpha<\delta} \dot{M}_\alpha$. At the successor stage, let $\dot{M}_{\alpha+1}$ be the collection of terms $\dot{x}$ for $x \in V_\alpha \cap M$ and "abstraction" terms corresponding to formulas allowing parameters from $\dot{M}_\alpha$ and quantifiers $\forall_\alpha$ and $\exists_\alpha$. It is crucial that this ramified language with abstraction terms is entirely formalizable in M, through a systematic coding of symbols. Once a set G is provided from the outside, a model $M[G] = \bigcup_{\alpha<\rho} M_\alpha[G]$ would be determined by the terms, where each $\dot{x}$ is to be interpreted by x for $x \in M$ and $\dot{G}$ is to be interpreted by G, so that: $M_0[G] = \{G\}$; for limit ordinals δ < ρ, $M_\delta[G] = \bigcup_{\alpha<\delta} M_\alpha[G]$; and $M_{\alpha+1}[G]$ consists of the sets in $V_\alpha \cap M$ together with sets interpreting the abstraction terms as the corresponding definable subsets of $M_\alpha[G]$ with $\forall_\alpha$ and $\exists_\alpha$ ranging over this domain.

But what properties can be imposed on G to ensure that M[G] be a model of ZF? Cohen's key idea was to tie G closely to M through a system of sets in M called conditions that would approximate G. While G may not be a member of M, G is to be a subset of some $Y \in M$ (with Y = ω a basic case), and these conditions would "force" some assertions about the eventual M[G], that is, by deciding some of the membership questions, whether $x \in G$ or not for $x \in Y$. The assertions are to be just those expressible in the ramified language, and Cohen developed a corresponding forcing relation $p \Vdash \varphi$, "p forces φ", between conditions p and formulas φ, a relation with properties reflecting his approximation idea. For example, if $p \Vdash \varphi$ and $p \Vdash \psi$, then $p \Vdash \varphi \mathbin{\&} \psi$. The conditions are ordered according to the constraints they impose on the eventual G, so that if $p \Vdash \varphi$, and q is a stronger condition, then $q \Vdash \varphi$. Scott made an important suggestion simplifying the definition for negation: $p \Vdash \neg\varphi$ if for no stronger condition q does $q \Vdash \varphi$. It was crucial to Cohen's approach that the forcing relation, like the ramified language, be definable in M.

The final ingredient is that the whole scaffolding is given life by incorporating a certain kind of set G. Stepping out of M and making the only use of its countability, Cohen enumerated the formulas of the ramified language in a countable sequence (shades of Skolem's Paradox!) and required that G be completely determined by a countable sequence of stronger and stronger conditions $p_0, p_1, p_2, \ldots$ such that for every formula φ of the ramified language exactly one of φ or ¬φ is forced by some $p_n$. Such a G is called a generic set. Cohen was able to show that the resulting M[G] does indeed satisfy the axioms of ZF: Every assertion about M[G] is already forced by some condition; the forcing relation is definable in M; and so the ZF axioms, holding in M, most crucially Power Set and Replacement, can be applied to derive corresponding forcing assertions about ZF axioms holding in M[G].
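The approximation idea can be illustrated in the basic case Y = ω. The sketch below makes assumptions that go beyond the text: a condition is modelled as a finite partial function from natural numbers to truth values, recording which membership questions have been decided, and all function names are hypothetical. It builds only a finite chain of stronger and stronger conditions, so it illustrates the ordering of conditions and not genericity, which requires settling every formula of the ramified language.

```python
from typing import Dict, List, Set

# Assumption made here: a condition decides finitely many questions "is n in G?".
Condition = Dict[int, bool]


def is_stronger(q: Condition, p: Condition) -> bool:
    """q is stronger than (or equal to) p if q agrees with every decision p makes."""
    return all(n in q and q[n] == v for n, v in p.items())


def compatible(p: Condition, q: Condition) -> bool:
    """Two conditions are compatible if they never decide the same question differently."""
    return all(q[n] == v for n, v in p.items() if n in q)


def decide(p: Condition, n: int, value: bool) -> Condition:
    """Extend p so that membership of n is decided, keeping any earlier decision."""
    q = dict(p)
    q.setdefault(n, value)
    return q


def build_chain(choices: List[bool]) -> List[Condition]:
    """A chain p_0, p_1, ... in which p_(k+1) additionally decides membership of k."""
    chain: List[Condition] = [{}]
    for n, value in enumerate(choices):
        chain.append(decide(chain[-1], n, value))
    return chain


def approximated_set(p: Condition) -> Set[int]:
    """The numbers that p places into the eventual set G."""
    return {n for n, v in p.items() if v}


chain = build_chain([True, False, True])
```

Once a condition has decided a membership question, every stronger condition decides it the same way, which is the simplest instance of the monotonicity of the forcing relation noted above.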

The extent and breadth of the expansion of set theory described henceforth far overshadows all that has been described before, both in terms of the numbers of people involved and the results established. With clear intimations of a new and concrete way of building models, set theorists rushed in and with forcing were soon establishing a cornucopia of relative consistency results, truths in a wider sense, some illuminating classical problems of mathematics. Many different forcings were constructed for adding new real numbers and iterated forcing techniques were quickly broached.

Robert Solovay played a prominent role in the forging of forcing as a general method, and he above all in this period raised the level of sophistication of set theory across its breadth from forcing to large cardinals. Solovay proved a result already in 1964 remarkable for its sophistication: Suppose that κ is an inaccessible cardinal; then in an inner model of a forcing extension, κ becomes $\aleph_1$, the least uncountable cardinal, every set of real numbers is Lebesgue measurable, and Dependent Choices (a substantial form of AC for bolstering measure) holds. This model offered important insights into the possibilities of measure and the limits imposed by AC. The inaccessible cardinal was thought for some time to be an artifact of the proof, when in 1979 Saharon Shelah finally complemented Solovay's result by showing that if every set of real numbers is Lebesgue measurable and Dependent Choices holds, then $\aleph_1$ (in V) is inaccessible in the constructible universe L.

Through the 1970s and into the 1980s the forcing method was honed with sophisticated iterated forcing techniques, techniques that established new, more contextualized relative consistency results in the self-generating mainstreams of set theory, infinitary combinatorics and cardinal invariants of the continuum. Donald Martin formulated an instrumental "axiom," Martin's Axiom (MA), in terms of forcing notions, an axiom that became convenient and focal for relative consistency results. MA together with the failure of CH is relatively consistent with ZFC via forcing, and MA directly implies many combinatorial statements in a way analogous to how CH had, and so relative consistency results can be established by drawing direct consequences from MA. A culmination in this direction was the work of Shelah in the 1980s on proper forcing, a wide class of forcing notions. Corresponding to MA in this context is the Proper Forcing Axiom, an axiom requiring large cardinals to establish its relative consistency. An important barrier that has resisted many efforts is that starting with a model of CH, many iterated forcing constructions have established the relative consistency of various propositions with the continuum being $\aleph_2$, but corresponding relative consistencies with the continuum being at least $\aleph_3$ are not known. Can this be a limitation of forcing, or a delimitation imposed by ZFC?

Large Cardinals and Inner Models

A subtle connection quickly emerged, already in the 1960s and into the 1970s, between large cardinals and combinatorial propositions low in the cumulative hierarchy: Forcing showed just how relative the Cantorian notion of cardinality is, since one-to-one correspondence functions could be adjoined to models of set theory easily, often with little disturbance. In particular large cardinals, highly inaccessible from below, were found to satisfy substantial propositions even after they were "collapsed" by forcing to $\aleph_1$ or $\aleph_2$, that is, correspondence functions were adjoined to make the cardinal the first or second uncountable cardinal respectively. Conversely such propositions were found to entail large cardinal hypotheses in the clarity of an L-like inner model, sometimes the very same initial large cardinal hypothesis. Thus, in a subtle synthesis, hypotheses of length concerning the extent of the transfinite were correlated with hypotheses of width concerning the fullness of power sets low in the cumulative hierarchy, sometimes the arguments providing equi-consistencies. Solovay's Lebesgue measurability result from inaccessibility when complemented by Shelah's result became an equi-consistency, albeit a sophisticated one bringing together Cantor's two legacies, the investigation of definable sets of reals and the extension of number into the transfinite. Other "weak" large cardinals were formulated, sometimes in response to the need of a large cardinal concept to gauge a set-theoretic proposition via equi-consistency. The complementarity also encompassed "strong" large cardinal hypotheses formulated in terms of elementary embeddings and later, new canonical inner models.

Large cardinal hypotheses stronger than measurability were charted out in the late 1960s, motivated not only by the heuristics of generalization but also by those of reflection. The direct reflection heuristic is that various properties attributable to the class of all ordinals, since its extent is uncharacterizable, should be attributable already to some cardinal. This heuristic was already at work in Zermelo's 1930 paper and extends the closure provided by Replacement. The more subtle reflection heuristic is that strong large cardinal hypotheses posit elementary embeddings j : V → M, and the closer the target inner model M is to V, the stronger the properties that translate and can be reflected between them. The supercompact cardinals were thus formulated by Solovay and William Reinhardt as global generalizations of measurable cardinals; stronger than these were the n-huge cardinals; and still stronger hypotheses were formulated. There is an ultimate delimitation in this direction that has framed the possibilities: Kenneth Kunen established in ZFC that there can be no elementary embedding j : V → V of the universe into itself. ZFC rallied at last to force a veritable Götterdämmerung for large cardinals.

The theory of these strong hypotheses was developed particularly to investigate the possibilities for elementary embeddings. But what really intimated their potentialities were new forcing proofs, especially from supercompactness, that established the relative consistency of strong existence assertions low in the cumulative hierarchy, at the very least lending these assertions an initial plausibility. The possibility of a new complementarity was then brought about through the development of inner model theory, the most sophisticated part of the theory of large cardinals.

Gödel's L was the first inner model, and Ronald Jensen dramatically transformed its investigation in the 1960s by refining the first-order definability and Skolem hull arguments to a "fine structure" analysis, extracting important combinatorial principles and establishing new relative consistencies. Inner models of measurability were soon developed, and their interactions and fine structure investigated, and these models would be paradigmatic for inner models of large cardinals: They exhibited in their crystalline clarity akin to algebraic closure the minimal consequences of the large cardinal hypothesis and the maximal structural regularity. In the 1970s, Jensen and Anthony Dodd developed the core model for measurability, and this would be paradigmatic for core models of large cardinals: These were inner models that did not contain the large cardinal, but exhibited the maximal possibilities "up to" the cardinal. The ascent through the large cardinal hierarchy had begun, the inner and core models providing an abiding sense of structure for large cardinal hypotheses.

The development of core models, while quickly developing a life of its own, was initially triggered by work on the Singular Cardinals Problem. With the advent of forcing it had been quickly seen that ZFC imposed little control on the powers 2^κ of regular cardinals κ, successor or limit, since it became possible to extend a model of set theory by adjoining arbitrarily many subsets of such κ without adjoining any subsets of smaller cardinals. Thus Cantor's Continuum Problem and its generalization to regular cardinals were informed by a general manifestation of method. What about singular cardinals? Powers of singular cardinals seemed much less flexible with respect to forcing, and the Singular Cardinals Problem is the general problem of clarifying the possibilities for the function 2^κ for singular cardinals κ. Jensen, who found a seminal 1974 result of Jack Silver on powers of singular cardinals "shocking," was directly inspired by it to establish the Covering Theorem for L, easily the most important result of the 1970s in set theory. Very loosely speaking this theorem asserts that unless a surprisingly simple proximity criterion between V and L holds, a large cardinal transcendence over L ensues. It was efforts to extend this result that led to the core models. Through forcing and inner model analysis, results especially of Moti Gitik of the late 1980s established equi-consistency results for simple assertions about powers of singular cardinals and showed remarkable level-by-level connections with large cardinals that affirmed their central place in the investigation of the transfinite.

The extensive research through the 1970s and 1980s considerably strengthened the view that the emerging hierarchy of large cardinals provides the hierarchy of exhaustive principles against which all possible consistency strengths can be gauged, a kind of hierarchical completion of ZFC. First the various hypotheses, though historically contingent, form a linear hierarchy, one neatly delimited by Kunen's inconsistency result. Typically for two large cardinal hypotheses, below a cardinal satisfying one there are many cardinals satisfying the other, in a sense prescribed by the first. And second, a variety of strong propositions have been informatively bracketed in consistency strength between two large cardinal hypotheses: the stronger hypothesis implies that there is a forcing extension in which the proposition holds; and if the proposition holds, there is an inner model satisfying the weaker hypothesis.
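The bracketing just described can be written schematically. In the display below, the symbols are notational assumptions introduced only for illustration: φ stands for the proposition being gauged, ψ_strong and ψ_weak for the two large cardinal hypotheses, and Con(T) for the assertion that the theory T is consistent. The first implication is the forcing half of the argument and the second is the inner model half.

```latex
\[
\mathrm{Con}(\mathrm{ZFC} + \psi_{\mathrm{strong}})
\;\Longrightarrow\;
\mathrm{Con}(\mathrm{ZFC} + \varphi)
\;\Longrightarrow\;
\mathrm{Con}(\mathrm{ZFC} + \psi_{\mathrm{weak}})
\]
```

When the same hypothesis can serve on both ends, the two implications close up into an equi-consistency between the proposition and that hypothesis.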

One of the great successes for large cardinals has to do with perhaps the most distinctive and intriguing development in modern set theory. Although the determinacy of games has roots as far back as a 1913 note of Zermelo, the concept of infinite games only began to be seriously explored in the 1960s when it was realized that it led to "regularity" properties for sets of real numbers like Lebesgue measurability.

With ω the set of natural numbers let ω^ω denote the set of functions from ω to ω. For A ⊆ ω^ω, G(A) denotes the following "infinite two-person game with perfect information": There are two players, I and II. I initially chooses an x(0) ∈ ω; then II chooses an x(1) ∈ ω; then I chooses an x(2) ∈ ω; then II chooses an x(3) ∈ ω; and so forth:

I:   x(0)        x(2)        ...
II:        x(1)        x(3)  ...

Each choice is a move of the game; each player before making each of his moves is privy to the sequence of previous moves ("perfect information"); and the players together specify an x ∈ ω^ω. I wins G(A) if x ∈ A, and otherwise II wins. A strategy is a function from finite sequences of natural numbers to natural numbers that tells a player what move to make given the sequence of previous moves. A winning strategy is a strategy such that if a player plays according to it he always wins no matter what his opponent plays. A is determined if either I or II has a winning strategy in G(A). The extent of the determinacy of games was investigated through hierarchies of definable sets of reals, and in 1962 the following sweeping axiom was proposed:

Axiom of Determinacy

Every A ⊆ ω^ω is determined.
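The notions of strategy and winning strategy defined above can be illustrated on a finite truncation of the game. The following is only a sketch under loud assumptions: the game is cut off after finitely many moves, each move is drawn from a finite range, and the names `play`, `i_wins_from`, `solve`, and `even_sum` are hypothetical. Under those assumptions one of the two players always has a winning strategy, found by backward induction, whereas for the infinite game G(A) no such exhaustive search is available and determinacy is a genuine hypothesis.

```python
def play(strategy_I, strategy_II, length):
    """Return the first `length` moves when both players follow their strategies.

    A strategy is a function from the tuple of previous moves to the next move.
    """
    moves = ()
    for n in range(length):
        mover = strategy_I if n % 2 == 0 else strategy_II  # I moves at even stages
        moves += (mover(moves),)
    return moves


def i_wins_from(position, payoff, length, choices):
    """True if player I can force payoff(x) to hold starting from `position`."""
    if len(position) == length:
        return payoff(position)
    outcomes = (
        i_wins_from(position + (m,), payoff, length, choices)
        for m in range(choices)
    )
    # At even stages I needs one good move; at odd stages every reply
    # by II must still leave I in a winning position.
    return any(outcomes) if len(position) % 2 == 0 else all(outcomes)


def solve(payoff, length, choices):
    """Return (winner, winning strategy) for the finite game by backward induction."""
    i_wins = i_wins_from((), payoff, length, choices)

    def strategy(position):
        # Choose a move that keeps the position winning for the winner.
        for m in range(choices):
            if i_wins_from(position + (m,), payoff, length, choices) == i_wins:
                return m
        return 0  # fallback for positions that cannot arise while following this strategy

    return ("I" if i_wins else "II"), strategy


def even_sum(x):
    """Hypothetical payoff set A: runs whose moves sum to an even number."""
    return sum(x) % 2 == 0


# Four moves, each 0 or 1; II moves last and can always spoil the parity.
winner, strategy = solve(even_sum, length=4, choices=2)
run = play(lambda position: 0, strategy, 4)
print(winner, run)  # II (0, 0, 0, 1)
```

In this toy payoff the player who moves last controls the parity, so II wins the four-move game, while shortening the game to three moves hands the win to I. The brute-force recursion is exponential in the length and is meant only to make the definitions concrete.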

This axiom actually contradicts the Axiom of Choice, as one can get a counterexample A by "diagonalizing" through all strategies, and so the axiom was intended to hold at least in some inner model to establish regularity properties for sets of real numbers there. In the late 1960s initial connections were made between the Axiom of Determinacy and large cardinals by Solovay, who showed in ZF that the axiom implies that ℵ₁ is measurable, and by Martin, who showed in ZFC that if there is a measurable cardinal, then the analytic sets, the simplest significant sets of real numbers definable with quantifiers ranging over real numbers, are determined. Investigating further consequences of determinacy, a new generation of descriptive set theorists soon established an elaborate web of connections in the unabashed pursuit of structure for its own sake. Determinacy hypotheses seemed to settle many questions about definable sets of reals and to provide new modes of argument, leading to an opaque realization of the old Cantorian initiatives concerning sets of real numbers and the transfinite with determinacy replacing well-ordering as the animating principle. By the late 1970s a more or less complete theory for the "projective" sets of real numbers was in place, and with this completion of a main project of descriptive set theory attention began to shift to questions of overall consistency.

The investigation of the Axiom of Determinacy spurred dramatic advances in the theory of large cardinals and affirmed their central role in gauging consistency strength. In the 1970s the strength of the methods made possible by the axiom led to speculation that either the axiom was orthogonal to large cardinals or would subsume them in a substantial way. However, large cardinal hypotheses, first near Kunen's inconsistency and then around supercompactness, were shown to tame Determinacy. By looking at the workings of a proof, Hugh Woodin in 1984 formulated what is now known as a Woodin cardinal. Then Martin and John Steel showed that having more and more Woodin cardinals establishes the determinacy of more and more sets in the "projective hierarchy" of sets, sets of real numbers definable with quantifiers ranging over the real numbers. Finally Woodin established by 1992: the existence of infinitely many Woodin cardinals is equi-consistent with the Axiom of Determinacy. Woodin cardinals are weaker than supercompact cardinals, closer to measurable cardinals, and in subsequent developments the inner model theory was advanced to getting inner and core models of Woodin cardinals.

In the late 1990s Woodin built on the wealth of ideas surrounding Woodin cardinals and Determinacy and, raising them to a higher level, proposed a resolution of the Continuum Problem itself. This resolution features the use of arbitrarily many Woodin cardinals, the assimilation of new principles for sets of sets of real numbers, and an unresolved new conjecture about a new "logic" that would complete the picture. Thus structural ideas involving large cardinal hypotheses may circle back to effect an ultimate resolution of the original problem that stimulated the development of set theory.

What about the consistency of large cardinal hypotheses? As postulations for cardinals of properties of the class of all ordinals, they inherit substantial inaccessibility properties from below, but even for large natural numbers given notationally, the meaning of a number is not conveyed by its dogged approach from below but by its mathematical postulation and the sense given it by proof and method. The inner model theory has fortified large cardinals up to Woodin cardinals by providing them with coherent inner models whose structure incisively exhibits their consistency. As for the hypotheses near Kunen's inconsistency, since that result was based on a combinatorial contingency, it could well be that a like inconsistency for a weaker hypothesis can be established. In any case these near-inconsistency hypotheses are less relevant, the forcing proofs applying them to get initial plausibilities having given way to more refined arguments from weaker hypotheses. Moreover the work of Woodin has shown that there is also quite a lot of structure near the Kunen inconsistency, analogous to the descriptive set theory of real numbers.

Stepping back to gaze at modern set theory, the thrust of mathematical research should deflate various possible metaphysical appropriations with an onrush of new models, hypotheses, and results. Shedding much of its foundational burden, set theory has become an intriguing field of mathematics where formalized versions of truth and consistency have become matters for manipulation as in algebra. As a study couched in well-foundedness, ZFC together with the spectrum of large cardinals serves as a court of adjudication, in terms of relative consistency, for mathematical propositions that can be informatively contextualized in set theory by letting their variables range over the set-theoretic universe. Thus set theory is more of an open-ended framework for mathematics than an elucidating foundation. It is as a field of mathematics proceeding with its own internal questions and capable of contextualizing over a broad range that set theory has become an intriguing and highly distinctive subject.

See also Cantor, Georg; First-Order Logic; Gödel, Kurt; Gödel's Theorem; Hilbert, David; Logical Paradoxes; Logic, History of: Modern Logic; Mathematics, Foundations of; Model Theory; Neumann, John von; Russell, Bertrand Arthur William; Second-Order Logic; Tarski, Alfred; Truth.


Bartoszyński, Tomek, and Haim Judah. Set Theory: On the Structure of the Real Line. Wellesley, MA: A K Peters, 1995.

Cantor, Georg. Contributions to the Founding of the Theory of Transfinite Numbers. Mathematische Annalen 46 (1895): 481–512; and II. Mathematische Annalen 49 (1897): 312–351; with introduction and notes by Philip E. B. Jourdain. Chicago: Open Court, 1915. Reprinted New York: Dover, 1965.

Dauben, Joseph W. Georg Cantor: His Mathematics and Philosophy of the Infinite. Cambridge, MA: Harvard University Press, 1979.

Ewald, William, ed. From Kant to Hilbert: A Source Book in the Foundations of Mathematics. Oxford: Clarendon Press, 1996.

Gödel, Kurt. Collected Works. Vol. 1. New York: Oxford University Press, 1986.

Gödel, Kurt. Collected Works. Vol. 2. New York: Oxford University Press, 1990.

Hallett, Michael. Cantorian Set Theory and Limitation of Size. Logic Guides #10. Oxford: Clarendon Press, 1984.

Jech, Thomas. Set Theory. Berlin: Springer Verlag, 2002.

Kanamori, Akihiro. The Higher Infinite: Large Cardinals in Set Theory from their Beginnings. 2nd ed. Berlin: Springer Verlag, 2003.

Kanamori, Akihiro. "Zermelo and Set Theory." The Bulletin of Symbolic Logic 10 (2004): 487–553.

Kunen, Kenneth. Set Theory: An Introduction to Independence Proofs. Amsterdam: North-Holland, 1980.

Moore, Gregory H. Zermelo's Axiom of Choice: Its Origins, Development, and Influence. New York: Springer-Verlag, 1982.

Moschovakis, Yiannis N. Descriptive Set Theory. Amsterdam: North-Holland, 1980.

Shelah, Saharon. Cardinal Arithmetic. Oxford: Clarendon Press, 1994.

Shelah, Saharon. Proper and Improper Forcing. 2nd ed. Berlin: Springer, 1998.

Van Heijenoort, Jean, ed. From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931. Cambridge, MA: Harvard University Press, 1967.

Woodin, W. Hugh. The Axiom of Determinacy, Forcing Axioms, and the Nonstationary Ideal. Berlin: Walter de Gruyter, 1999.

Woodin, W. Hugh. "The Continuum Hypothesis I and II." Notices of the American Mathematical Society 48 (2001): 567–576 and 681–690.

Akihiro Kanamori (2005)
