From 1905 to 1915 Albert Einstein revolutionized the conception of space and time and gravity that had been central in physics since Isaac Newton. For a brief discussion of the history of the development of relativity see the entry "Einstein, Albert." This entry describes the content of the theories.
The special and general theories of relativity are, at heart, theories of spatiotemporal structure. They are not particularly about observers or reference frames or ways to synchronize clocks, although as fundamental physical theories they have implications about what observers will observe and what various physical procedures for coordinating clocks will accomplish. It is easy to fall under the impression that these theories are basically concerned about coordinate systems or reference frames because physical events are typically described by means of coordinates or reference frames, but that temptation ought to be avoided.
Perhaps the easiest way to understand special relativity is by analogy to Euclidean geometry. Euclidean geometry postulates a particular spatial structure and, beginning with Euclid's Elements, the implications of that structure for geometrical figures were studied by purely geometrical methods. For two millennia, the study of Euclidean geometry made no use of coordinate systems or of numbers. The introduction of Cartesian coordinates allowed for the translation of geometrical objects into algebraic ones by means of assigning numbers as coordinates to points. There are all sorts of ways to lay down coordinates on a Euclidean space, such as polar coordinates or spherical coordinates, but the most familiar is the system of Cartesian coordinates. Cartesian coordinates are rectilinear and orthogonal; the coordinate curves are straight lines that intersect at right angles. Because of this feature, distances between points in a Euclidean space are easy to calculate from their Cartesian coordinates: If point p has coordinates $(x_p, y_p, z_p)$ and point q has coordinates $(x_q, y_q, z_q)$, then the distance from p to q is:
$$\sqrt{(x_p - x_q)^2 + (y_p - y_q)^2 + (z_p - z_q)^2}$$
In most spaces, such as the surface of a sphere, Cartesian coordinates do not exist. It turns out that for a space to be Euclidean is just for the space to admit of Cartesian coordinates. That is, the distances between points in the space must be of just the right form for Cartesian coordinates to exist.
In order to grasp relativity, we have to think not of distances between points in a three-dimensional space, but of a fundamentally spatiotemporal distance between points in a four-dimensional space-time. Points in the space-time correspond to instantaneous, localized events, such as the bursting of a bubble when it reaches the surface of a glass of champagne. Such events occur both at a place and at a time. To locate these events, we typically ascribe to them four numbers, such as a latitude, longitude, altitude, and time. It is in this uncontroversial sense that the space-time of classical physics and of relativity is four-dimensional.
What sorts of spatiotemporal relations are there between events? All of classical physics agreed on at least one point: There is a definite, objective, purely temporal relation between the events. Two events either take place at the same time, or one takes place a certain amount of time before the other. So the notion of there being a lapse of time between events, and the specific case of simultaneity of events, is inherent in the classical account of space-time structure.
The classical account of spatial structure is not so straightforward. Newton believed that a single three-dimensional Euclidean space persists through time, and that every event, whenever it occurs, takes place somewhere in that absolute space. So Newton thought that any pair of events, no matter whether they occur at the same or different times, have some spatial distance separating them. But consider the following case: On a train traveling along the tracks, there sits a glass of champagne. A bubble rises to the surface and pops, followed a minute later by a second bubble. How far was the first popping from the second?
According to a passenger on the train, the two events took place in close spatial proximity, within a few inches of each other. But according to a spectator watching the train go by, these two events would be considered yards apart because the train has moved in the intervening minute. Newton would insist that there is a true spatial distance between the events, even though no observation could reveal for certain whether the passenger or the spectator (or neither) is right. But a natural reaction is to reject the whole question: There may be definite spatial relations between simultaneous events, but there is no fact at all about the spatial distance between nonsimultaneous events. Thus we arrive at two classical space-time structures: Newtonian space-time, with temporal and spatial relations between every pair of events, and Galilean (or neo-Newtonian) space-time, with temporal relations between all events and spatial relations only between simultaneous events (Galilean space-time then needs to add a new spatiotemporal structure, called an affine connection, to distinguish inertial from non-inertial trajectories). Note that the classical accounts agree on the temporal structure, and particularly on the objective physical relation of simultaneity.
Special relativity postulates a four-dimensional space-time with a radically different spatiotemporal structure. Instead of having a pure temporal structure and a pure spatial structure, there is a single relativistic "distance" between events (the scare quotes around distance must be taken seriously, as the quantity is not at all like a spatial distance). How can this spatiotemporal structure be specified?
The easiest method, albeit a bit roundabout, is by means of coordinates. Here we will take the analogy with Euclidean geometry quite seriously. As we saw, even though Euclidean geometry has no need of coordinate systems, the spatial structure of a Euclidean space can still be specified in this way; a Euclidean space is a space that admits of Cartesian coordinates. More specifically, a three-dimensional Euclidean space has a structure of distance relations among its points such that each point can be given coordinates (x,y,z) and the distance between any pair of points p and q is:
$$\sqrt{(x_p - x_q)^2 + (y_p - y_q)^2 + (z_p - z_q)^2}$$
In exactly the same way, we can specify the spatiotemporal structure of Minkowski space-time, the space-time of special relativity. Minkowski space-time is a four-dimensional manifold that admits of Lorentz coordinates (or Lorentz frames). A Lorentz frame is a system of coordinates (t, x, y, z) such that the relativistic spatiotemporal distance between any pair of events p and q is:
$$\sqrt{(t_p - t_q)^2 - (x_p - x_q)^2 - (y_p - y_q)^2 - (z_p - z_q)^2}$$
Written this way, the similarity with the example of Cartesian coordinates on Euclidean space is manifest; the only difference is the minus signs in place of plus signs. The consequences of that small mathematical difference are profound.
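The comparison can be made concrete with a short sketch. It assumes, as the Lorentz-coordinate formula implicitly does, units in which time and space coordinates are directly comparable (c = 1); the function names are illustrative only. It works with the squared interval, so that no square root of a negative number is ever attempted.

```python
import math

def euclidean_distance(p, q):
    """Distance between two points given in Cartesian coordinates (x, y, z)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def interval_squared(p, q):
    """Squared relativistic interval between two events given in Lorentz
    coordinates (t, x, y, z): one plus sign, three minus signs."""
    dt, dx, dy, dz = (a - b for a, b in zip(p, q))
    return dt**2 - dx**2 - dy**2 - dz**2

print(euclidean_distance((0, 0, 0), (3, 4, 0)))      # 5.0
print(interval_squared((0, 0, 0, 0), (5, 4, 0, 0)))  # 9
```

The two functions differ only in the signs attached to the squared coordinate differences.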
Before investigating the nature of this spatiotemporal structure, we should renew some of our caveats. First, there is always the temptation to invest the coordinates with some basic physical significance. For example, it is very natural to regard the coordinate we are calling t as a time coordinate, and to suppose that it has something to do with what is measured by clocks. But as of yet, we have said nothing to justify that interpretation. The Lorentz coordinates are just some way or other of attaching numbers to points such that the quantity defined above is proportional to the spatiotemporal distance between events. Indeed, just as there are many ways to lay down Cartesian coordinates on a Euclidean plane (systems can differ with respect to the origin and orientation of the coordinate grid), so there are many ways to lay down Lorentz coordinates in Minkowski space-time. Different systems will assign different t values to the points, and will disagree about, for example, the difference in t value between two events. We do not invest these differences with any physical significance; because the various systems agree about the quantity defined above, they agree about all that is physically real.
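As an illustration of this last point, the following sketch passes from one Lorentz coordinate system to another and checks that the two disagree about differences in t while agreeing about the interval. The change of coordinates used is the standard Lorentz boost along the x axis, which this entry does not itself state; the velocity 3/5 (in units with c = 1) is chosen only because it makes the arithmetic exact.

```python
from fractions import Fraction

def interval_squared(p, q):
    """Squared interval between events in Lorentz coordinates (t, x, y, z)."""
    dt, dx, dy, dz = (a - b for a, b in zip(p, q))
    return dt**2 - dx**2 - dy**2 - dz**2

def boost_x(event, v, gamma):
    """Standard Lorentz boost along x (assumed here, not taken from the text).
    The caller must supply gamma = 1 / sqrt(1 - v**2)."""
    t, x, y, z = event
    return (gamma * (t - v * x), gamma * (x - v * t), y, z)

v, gamma = Fraction(3, 5), Fraction(5, 4)  # 1 / sqrt(1 - 9/25) = 5/4
p, q = (0, 0, 0, 0), (5, 4, 0, 0)
p2, q2 = boost_x(p, v, gamma), boost_x(q, v, gamma)

print(q[0] - p[0], q2[0] - p2[0])                       # 5 13/4
print(interval_squared(p, q), interval_squared(p2, q2))  # 9 9
```

The difference in t value changes from 5 to 13/4, but the squared interval is 9 in both systems.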
A second caveat is in order. We have been speaking so far as if the spatiotemporal distance between events is itself a number (viz., the number that results when one plugs the coordinates of the events into the formula above). But it is easy to see that this is wrong even in the Euclidean case. Distances are only associated with numbers once one has chosen a scale, such as inches or meters. What exists as a purely geometrical, nonnumerical structure is rather a system of ratios of distances. Having chosen a particular geometrical magnitude as a unit, other magnitudes can be expressed as numbers (viz., the numbers that represent the ratio between the unit and the given magnitude). The Greeks had a deep insight when they divided mathematics into arithmetic (the theory of number) and geometry (the theory of magnitude). They recognized that the theory of ratios applied equally to each field, but kept the two subjects strictly separate. Our use of coordinates to associate curves in space with algebraic functions of numbers has blurred the distinction between magnitudes and numbers. To understand relativity, it is important to recognize the conventions employed to associate geometrical structure with numerical structure.
Holding these warnings in mind, let us turn to the relativistic spatiotemporal distance. What are the consequences of replacing the plus signs in the Euclidean distance function with minus signs?
One obvious difference between the Euclidean structure and the Minkowski structure is this: In Euclidean space, the distance between any two distinct points is always positive, and the only zero distance is between a point and itself. In mathematical terms, the Euclidean metrical structure is positive definite. But in the Minkowski structure, two distinct events can have zero distance between them. For example, the events with coordinates (0,0,0,0) and (1,1,0,0) have zero distance (where we list the coordinates in the order (t,x,y,z)). Of course, this does not mean that these two events are the same event; assigning the numerical value zero to this sort of distance is just a product of the conventions we have used for assigning numbers to the distances. But the fact that two events have a zero distance between them does show that they are related in a particular spatiotemporal way. In order to remind ourselves that these spatiotemporal distances do not behave like spatial distances, from now on we will call them spatiotemporal intervals.
If we choose a particular event, the popping of a particular champagne bubble, and call the event p, then we can consider the entire locus of events that have zero interval from p. There will be infinitely many such events. If p happens to be at the origin of a Lorentz frame, assigned coordinates (0,0,0,0), then among the events at zero interval from it are (1,1,0,0), (1,0,1,0), (5,0,-3,4), and (-6,4,-4,2). To get a sense of how these events are distributed in space-time, we draw a space-time diagram, but again one must be very cautious when interpreting these diagrams. The diagrams must suppress one or two dimensions of the space-time, because we cannot draw four-dimensional pictures, but that is not the principal problem. The main problem is that the diagrams are drawn on a Euclidean sheet of paper, even though they represent events in Minkowski space-time. There is always the danger of investing some of the Euclidean structure of the representation with physical significance it does not have. Bearing that in mind, the natural thing to do is to suppress the z coordinate and draw the x, y, and t coordinates as the x, y, and z coordinates of three-dimensional Euclidean space.
Adopting these conventions, the points at zero interval from (0,0,0) will be points that solve the equation $t^2 - x^2 - y^2 = 0$, or $t^2 = x^2 + y^2$. The points that solve this equation form a double cone whose apex is at the origin. According to relativity, the intrinsic spatiotemporal structure associates such a double cone with every event in the space-time. This locus of points is called the light-cone of the event p, and divides into two pieces, the two cones that meet at p. These cones are called the future light-cone and the past light-cone of p.
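A quick check, sketched here in the full four coordinates rather than the three of the diagram, confirms that the sample events (1,1,0,0), (1,0,1,0), (5,0,-3,4), and (-6,4,-4,2) lie on the light-cone of the origin, and sorts them into the future and past cones by the sign of t. The helper names are illustrative.

```python
def interval_squared(p, q):
    """Squared interval between events in Lorentz coordinates (t, x, y, z)."""
    dt, dx, dy, dz = (a - b for a, b in zip(p, q))
    return dt**2 - dx**2 - dy**2 - dz**2

def light_cone_branch(apex, event):
    """Return 'future' or 'past' if event lies on the light-cone of apex
    (and is distinct from it), otherwise None."""
    if event == apex or interval_squared(apex, event) != 0:
        return None
    return "future" if event[0] > apex[0] else "past"

origin = (0, 0, 0, 0)
for event in [(1, 1, 0, 0), (1, 0, 1, 0), (5, 0, -3, 4), (-6, 4, -4, 2)]:
    print(event, light_cone_branch(origin, event))
```

The first three events fall on the future light-cone; the last, with negative t, falls on the past light-cone.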
As the name light-cone suggests, we are now in a position to make contact between the spatiotemporal structure postulated by relativity and the behavior of physical entities. According to the laws of relativistic physics, any light emitted at an event (in a vacuum) will propagate along the future light-cone of the event, and any light that arrives at an event (in a vacuum) arrives along the past light-cone. So the tiny flash of light emitted when our champagne bubble pops races away from the popping event along its future light-cone. One can think of the ever-growing light-cone as representing the expanding circle (or, if we add back the z dimension, the expanding sphere) of light that originates at the bursting of the bubble.
Having associated the spatiotemporal structure with the behavior of an observable phenomenon such as light, we can now see how relativistic physics gains empirical content. For example, it is an observable fact that any pair of light rays traveling in parallel directions in a vacuum travel at the same speed; one light ray in a vacuum never overtakes another. This is not, of course, how material particles behave. One spaceship traveling in a vacuum can overtake another, or one electron in a vacuum can overtake another, because where a spaceship or an electron goes depends on more than the space-time location of the origin and direction of its journey. Two electrons can start out at the same place and time and set off in the same direction but end up in different locations because they were shot out at different speeds. Their trajectories depend on more than just the space-time structure. Light, in contrast, is intimately and directly tied to the relativistic space-time structure. Space-time itself, as it were, tells light in a vacuum where to go.
The assignment of zero relativistic interval between the origin of a light-cone and any event on it has one other notable consequence. We have already said that when we assign numbers to magnitudes, we want the ratios between the numbers to be identical to the ratios between the magnitudes. Because 0:0 is not a proper ratio, the relativistic interval does not license comparisons between the various intervals on a light-cone. If one light ray originates at (0,0,0,0) and travels to (1,1,0,0), and a second light ray originates at (0,0,0,0) traveling in some other direction, there is no fact about when the second light ray has gone as far as the first.
What other structure, beside the light-cone structure, does Minkowski space-time have? There is a well-defined notion of a straight line in the space-time, and this is accurately represented in our Euclidean space-time diagram: Straight lines in the Euclidean diagram correspond to straight trajectories in the space-time. Indeed, we have tacitly been appealing to the notion of a straight line all along; when we speak of the relativistic interval between two events, we mean the interval as measured along a straight line connecting the events, or, even more precisely, we mean the relativistic length of the straight line that connects the events. The straight-line structure (affine structure) of Minkowski space-time plays a central role in framing physical laws.
If a light ray is emitted from (0,0,0,0) into a vacuum, we already know that its trajectory through space-time will lie on the future light-cone of (0,0,0,0). But more than that, the trajectory will be a straight line on the light-cone. An analogous fact holds for material particles that travel below the speed of light. If a material particle is emitted from (0,0,0,0), its trajectory will lie entirely within the future light-cone of (0,0,0,0), which is to say that the particle can never travel at or above the speed of light. But more than that: If the particle is emitted into a vacuum, and is not subject to any forces, then its trajectory will be a straight line in space-time.
This law, in abstract form, enormously predates the theory of relativity. For this is just the proper space-time formulation of Newton's first law of motion: "Every body continues in its state of rest, or of uniform motion in a right line, unless compelled to change that state by forces impressed on it." The trajectory of a particle at rest or in uniform motion in Newtonian space-time is a straight line through the four-dimensional space-time. Newton's first law, stated in terms of space-time trajectories, also retains the same form in Galilean space-time, and can be taken over without change into Minkowski space-time. As we will see, in this abstract space-time formulation, Newton's first law also holds in the general theory of relativity. That is why we should try to formulate physical laws directly in terms of space-time structure.
Once we deal with material particles that travel below the speed of light, the relativistic interval takes on even greater significance. Consider a particle that travels from (0,0,0,0) to (5,4,0,0) along a straight trajectory (i.e., a particle emitted from the origin of the coordinate system that arrives at the event [5,4,0,0] without having any forces acting on it). The relativistic interval along its space-time trajectory is:
$$\sqrt{(5 - 0)^2 - (4 - 0)^2 - (0 - 0)^2 - (0 - 0)^2} = \sqrt{25 - 16} = \sqrt{9} = 3$$
The size of this interval has direct physical significance; it is proportional to the amount of time that will elapse for a clock that travels along that trajectory. Clocks in the theory of relativity are like odometers on cars; they measure the length of the path they take. But length here means the interval, and path the space-time trajectory of the clock. Events in space-time separated by positive intervals are time-like separated.
It is not, of course, a further unanalyzable postulate of relativity that clocks measure the interval along their trajectory; clocks are physical mechanisms subject to physical analysis. But one can easily analyze how a simple clock will behave, such as a clock that counts the number of times a light ray gets reflected between two mirrors, and find that the reading on the clock will be proportional to the interval along the clock's trajectory.
With the clock postulate in hand, we can now analyze the notorious twins paradox of relativity. One of a pair of twins takes a rocket from Earth and travels to a nearby star. Upon returning to Earth, the twin has aged less than the stay-at-home sister, and the clocks in the twin's spaceship show less elapsed time than those that remained on Earth. Why is that?
To be concrete, suppose the event of the rocket leaving Earth is at the point (0,0,0,0) in our coordinate system, and the rocket travels inertially (without acceleration) to the point (5,4,0,0). The rocket immediately turns around, and follows an inertial trajectory back to Earth, arriving at the event (10,0,0,0). The interval between (0,0,0,0) and (5,4,0,0) is, as we have seen, 3. Suppose this corresponds to an elapse of three years according to the onboard clocks. The return trajectory from (5,4,0,0) to (10,0,0,0) also has an interval length 3, corresponding to another three years elapsed. So the astronaut twin arrives back having aged six years, and having had all the experiences that correspond to six years of life.
The stay-at-home twin, however, always remained at the spatial origin of the coordinate system. Her trajectory through space-time is a straight line from (0,0,0,0) to (10,0,0,0). So the interval along her trajectory is 10, corresponding to an elapse of ten years. She will have biologically aged ten years at her sister's return, and had four more years of experience than her twin.
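The bookkeeping for the two trajectories can be sketched as follows, treating each trajectory as a list of events joined by straight segments and summing the intervals along the segments. Units are assumed to be years and light-years, so that an interval of 1 corresponds to one year on the traveling clock; the function names are illustrative.

```python
import math

def interval(p, q):
    """Interval between time-like (or light-like) separated events given in
    Lorentz coordinates (t, x, y, z), in units with c = 1."""
    dt, dx, dy, dz = (a - b for a, b in zip(p, q))
    return math.sqrt(dt**2 - dx**2 - dy**2 - dz**2)

def elapsed_time(path):
    """Sum of the intervals along a piecewise-straight trajectory."""
    return sum(interval(a, b) for a, b in zip(path, path[1:]))

departure, turnaround, reunion = (0, 0, 0, 0), (5, 4, 0, 0), (10, 0, 0, 0)
astronaut = elapsed_time([departure, turnaround, reunion])
stay_at_home = elapsed_time([departure, reunion])
print(astronaut, stay_at_home)  # 6.0 10.0
```

Nothing in the calculation refers to the other twin: each elapsed time depends only on the trajectory of the clock in question.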
The relativistic analysis of the situation is quite straightforward. It is really no more surprising, from a relativistic perspective, that the clocks of the twins will show different elapsed times from departure to return than it is surprising that two cars that start in the same city and end in the same city will show different elapsed mileage on their odometers, given that one took the freeway and the other a winding scenic route. The sense that there is a fundamental puzzle in the twins paradox only arises if one has mistaken views concerning the content of the theory of relativity.
In particular, it is often said that, according to the theory of relativity, all motion is the relative motion of bodies. If so, then there seems to be a complete symmetry between the twins: The motion of twin A relative to twin B is identical to the motion of twin B relative to twin A. But the relative motion of the twins plays no role at all in the physical analysis of the situation. The amount of time that elapses for twin B on her trip has nothing to do with what twin A is doing, or even if there is a twin A. The amount of time is just a function of the space-time interval along her trajectory.
It is also sometimes said that the theory of relativity gets rid of all absolute spatiotemporal structure; all facts about space and time are ultimately understood in terms of relations between bodies, so in a world with only one body there could be no spatiotemporal facts. This is also incorrect. The special theory of relativity postulates the existence of Minkowski space-time, whose intrinsic spatiotemporal structure is perfectly absolute, in whatever sense one takes that term. It is not a classical space-time structure, but it is not just a system of relations between bodies.
One occasionally also hears that the resolution of the twins paradox rests on facts about acceleration; the situation of the two twins is not exactly symmetric because the astronaut twin must accelerate (when she turns around to come home), whereas the stay-at-home twin does not. That is true, but irrelevant: The difference in elapsed time is a function of the intervals along the trajectories, not a function of the accelerations that the twins experience. Indeed, in the general theory of relativity we will be able to construct a twins scenario in which neither twin accelerates at all, but still they suffer different elapsed times between parting and reunion. It would be just as misleading to attribute the difference in elapsed time to the accelerations of the twins as it would the difference in odometer reading to the accelerations of the cars, even if the car that took the longer route did accelerate more.
The paradoxical or puzzling aspect of the twins paradox really arises from the difference between Euclidean geometry and Minkowski space-time geometry. If we draw the trajectories of the twins in space-time, we get a triangle whose corners lie at (0,0,0,0), (5,4,0,0), and (10,0,0,0). The astronaut twin travels along two edges of this triangle, whereas the stay-at-home twin travels along the third. And in Euclidean geometry, the sum of the lengths of any two sides of a triangle is greater than the length of the remaining side. But in Minkowskian geometry, the opposite is true: The sum of the intervals of two sides is less than the interval along the remaining side. Indeed, for time-like separated events, a straight line is the longest path between the two points in space-time. This is one consequence of exchanging the plus signs in the Euclidean metric for minus signs in the Minkowski metric.
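To see the reversal at work, one can vary the turnaround event. The sketch below keeps the endpoints (0,0,0,0) and (10,0,0,0) fixed, places a hypothetical turnaround at t = 5 and x = 0, 1, ..., 4, and measures the bent path both ways: with the Minkowski signs and, for contrast, as if the diagram were an ordinary Euclidean plane. The straight path has length 10 on either reckoning.

```python
import math

def bent_path_lengths(x):
    """Lengths of the two-segment path (0,0) -> (5,x) -> (10,0) in the (t, x)
    plane, returned as (Minkowski interval, Euclidean length)."""
    minkowski = 2 * math.sqrt(5**2 - x**2)
    euclidean = 2 * math.hypot(5, x)
    return minkowski, euclidean

for x in range(5):
    minkowski, euclidean = bent_path_lengths(x)
    print(x, round(minkowski, 3), round(euclidean, 3))
```

As the detour grows, the Euclidean length rises above 10 while the Minkowski interval falls below it, reaching 6 at x = 4.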
The relativistic clock postulate has been most strikingly checked using natural clocks: unstable particles whose decay rate displays a known half-life in the laboratory. The muon, a sort of heavy electron, is unstable and will decay on an average of $10^{-6}$ seconds after having been created. Muons can be created in the upper atmosphere by collisions between molecules in the air and high-energy cosmic rays. According to clocks on Earth, it should take the muon about $10 \times 10^{-6}$ seconds to reach the Earth, so very few should survive the trip without decaying. Nonetheless, many more muons than that calculation suggests do reach the Earth's surface. Calculation of the interval along the muon's trajectory predicts this because that interval corresponds to less than $10^{-6}$ seconds.
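A rough sketch of that calculation follows. The lifetime and trip time are the round figures used above; the muon speeds are hypothetical values, expressed as fractions of the speed of light, since no speed is specified here. With time in seconds and distance in light-seconds, a straight trajectory lasting T seconds at speed v covers a distance vT, so the interval along it is the square root of T squared minus (vT) squared.

```python
import math

MEAN_LIFETIME = 1e-6      # seconds (round figure from the text)
EARTH_FRAME_TRIP = 10e-6  # seconds, as timed by clocks on Earth

def muon_elapsed_time(trip_time, speed):
    """Interval along a straight trajectory lasting trip_time (Earth clocks)
    and covered at `speed`, a fraction of c (hypothetical input)."""
    return math.sqrt(trip_time**2 - (speed * trip_time)**2)

for speed in (0.9, 0.99, 0.995, 0.999):
    tau = muon_elapsed_time(EARTH_FRAME_TRIP, speed)
    print(speed, tau, tau < MEAN_LIFETIME)
```

On these assumptions the interval drops below the mean lifetime once the speed exceeds roughly 0.995 of the speed of light.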
If we idealize muons a bit, and imagine that they all decay in exactly $10^{-6}$ seconds (according to their own clocks), then we can use them to map out the geometry of Minkowski space-time. Suppose we create a swarm of muons in space and send them out in all directions. Their decays will provide a map in space-time of events that are all the same interval from the point of creation. If we choose units so that the size of the interval corresponds to seconds, and we choose the creation of the muons as the origin of the coordinate system, then the coordinates of the decay events will satisfy:
$$\sqrt{t^2 - x^2 - y^2 - z^2} = 10^{-6}$$
This is the equation of a hyperboloid of revolution that asymptotically approaches the light-cone, as depicted below.
The hyperboloid represents events all at the same interval from (0,0,0,0), and so corresponds to a circle or sphere of fixed radius in Euclidean geometry. There would be a corresponding hyperboloid in the past light-cone, representing places from which a muon could have been sent that would have decayed at (0,0,0,0).
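The shape of this locus can be sampled numerically. The sketch below restricts attention to the (t, x) plane and to idealized muons leaving the origin at a few hypothetical speeds (fractions of c). For a muon moving at speed v, the decay event satisfies x = vt together with the condition that the interval from the origin equals the lifetime, which gives t equal to the lifetime divided by the square root of 1 minus v squared.

```python
import math

TAU = 1e-6  # idealized muon lifetime in seconds

def decay_event(speed, tau=TAU):
    """(t, x) of the decay of a muon emitted from the origin at `speed`
    (fraction of c, |speed| < 1), found by solving t**2 - x**2 = tau**2
    together with x = speed * t."""
    t = tau / math.sqrt(1 - speed**2)
    return (t, speed * t)

for speed in (0.0, 0.6, -0.6, 0.9, 0.99):
    t, x = decay_event(speed)
    print(speed, t, x, math.sqrt(t**2 - x**2))
```

Every sampled event is at the same interval from the origin, while the ratio x/t approaches 1 as the speed does, which is the sense in which the hyperboloid hugs the light-cone.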
Indeed, we are now in a position to make a thoroughgoing analogy between the geometry of Minkowski space-time and Euclidean geometry that makes no reference to coordinates at all. Classical Euclidean geometrical proofs do not use coordinate systems or numbers; they use two instruments: the straightedge and the compass. The straightedge allows one to identify straight lines in the space, and the compass to draw the locus of points at a fixed distance from a given center. In Minkowski space-time, we can use light rays in a vacuum and inertially traveling particles as straightedges because their trajectories are straight lines in the space-time. Setting a Minkowski compass at interval zero and identifying a center should result in drawing the light-cone: the locus of points of interval zero from the center. So we can use light rays for this purpose. Setting the compass to draw points at a fixed positive interval should result in drawing a hyperboloid; we can use clocks for this just as the muons are employed above. In this way, we can free Minkowski geometry from coordinates altogether.
So far we have left one species of space-time relation out of account. All the points on the past or future light-cone of some event p are at zero interval from p. All the events inside the past or future light-cone are at positive interval from p (taking always the positive square root by convention). What of points that are outside the light-cone altogether?
The point labeled (0,1,0,0) is outside the light-cone of the point (0,0,0,0). If we plug these coordinates into our formula, we find that the interval between the points is:
$$\sqrt{(0 - 0)^2 - (1 - 0)^2 - (0 - 0)^2 - (0 - 0)^2} = \sqrt{-1} = i$$
That is, according to the definition of the interval that we have given, the interval between these points is imaginary. What could this mean?
Once again, we have to recall that the assignment of numbers to the intervals is somewhat a matter of convention. In fact, some physics books define the interval as:
$$\sqrt{-(t_p - t_q)^2 + (x_p - x_q)^2 + (y_p - y_q)^2 + (z_p - z_q)^2}$$
Here the interval between time-like separated events becomes imaginary. Does this mean that a clock could measure an imaginary number? Of course it can: Just take a regular clock and paint a little i after all the numerals! The numbers we assign to intervals have no intrinsic significance; it is the ratios between the numbers that represent the ratios among the magnitudes. Events that lie outside each other's light cones, so-called space-like separated events, have intervals among them that also stand in ratios to each other. The set of events at fixed space-like separation from (0,0,0,0) forms another sort of hyperboloid of revolution, depicted below.
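The three kinds of separation can be told apart by the sign of the squared interval alone, as in the following sketch. It uses the sign convention with the plus sign on the t term, and Python's complex square root to exhibit the imaginary values; the function names are illustrative.

```python
import cmath

def interval_squared(p, q):
    """Squared interval between events in Lorentz coordinates (t, x, y, z)."""
    dt, dx, dy, dz = (a - b for a, b in zip(p, q))
    return dt**2 - dx**2 - dy**2 - dz**2

def separation(p, q):
    """Classify a pair of distinct events by the sign of the squared interval."""
    s2 = interval_squared(p, q)
    if s2 > 0:
        return "time-like"
    if s2 == 0:
        return "light-like"
    return "space-like"

def interval(p, q):
    """Interval as a complex number; imaginary for space-like pairs."""
    return cmath.sqrt(interval_squared(p, q))

origin = (0, 0, 0, 0)
print(separation(origin, (5, 4, 0, 0)), interval(origin, (5, 4, 0, 0)))
print(separation(origin, (1, 1, 0, 0)), interval(origin, (1, 1, 0, 0)))
print(separation(origin, (0, 1, 0, 0)), interval(origin, (0, 1, 0, 0)))
# The ratio of two space-like intervals is an ordinary real number:
print(interval(origin, (0, 2, 0, 0)) / interval(origin, (0, 1, 0, 0)))
```

The factor of i cancels in the last line, which is why the ratios among space-like intervals are as meaningful as those among time-like ones.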
We now have a sense of the spatiotemporal structure of Minkowski space-time. A special relativistic physical theory must have laws that employ only this spatiotemporal structure. We could now go on to see how, for example, classical electromagnetic theory can be reformulated in this way, but that would take us too far from foundational issues.
It should be noticed that this account of special relativity has made no mention at all of several well-known features often associated with relativity, such as the constancy of the speed of light, the relativity of simultaneity, and the Lorentz-FitzGerald contraction. That is because all of these are frame-dependent (or coordinate system dependent) effects, and we have been presenting the theory in a frame-independent way. For example, we have no basis to discuss the relativity of simultaneity because we have had no ground, and no need, to introduce any notion of simultaneity at all. In classical physics, simultaneous events are events that take place at the same time, but we have no general notion of the time at which an event occurs, only the time that elapses on a clock following a certain trajectory. So the proper thing to say is not that special relativity implies the relativity of simultaneity, but that it implies the nonexistence of any objective notion of simultaneity. And we cannot discuss whether the speed of light is constant because we do not have any grounds to ascribe any speed to anything.
We have seen that a light ray can never overtake another light ray, but assessing a speed requires determining how far an object went in a given period of time. So far, we have not needed any notion of the distance an object travels, nor of the time that it takes to travel that distance. We can say how much time will elapse on a clock that follows a given trajectory, but that is evidently no use in defining a speed of light; no material clock can travel along with a light ray, and if it could, it would show no elapsed time for the journey. The notion of simultaneity requires a global time function, that is, an assignment of times to all events, so that there is a locus of events that are all assigned the same time. And the notion of a speed requires both the notion of the time that elapses between the start and the end of a journey, and the notion of the distance covered in that time. The relativistic space-time structure does not, per se, support either of these notions.
There is, however, a reasonably natural method for introducing both a global time function and a notion of spatial distance into Minkowski space-time. We begin with a family of co-moving inertial clocks (i.e., a family of clocks all moving on straight, parallel trajectories through space-time). There will be an infinitude of such families, corresponding to all the directions their trajectories can have. We begin by picking one such family.
We now want to "synchronize" the clocks. Scare quotes have to be put around the word since the classical notion of synchronization presupposes the notion of simultaneity: Synchronized clocks all show the same time at the same instant. But in relativity there is no such thing as the same instant. So one must think of the method we are about to describe as a way to coordinate a family of clocks that we simply call synchronizing them.
Let us choose a single master clock from our family of co-moving clocks. The other clocks will coordinate with this master clock by the following method: Each clock sends a light ray to the master clock, noting the time of emission (according to the sending clock). When the light ray reaches the master clock, it is immediately sent back and shows the time reading on the master clock at the moment it arrived. When this return signal reaches the sending clock, the time reading on the sending clock is noted. The sender, then, has three bits of data: the time it sent the signal (according to the sending clock), the time it received the return signal (ditto), and the reading on the master clock when the signal got to it. On this basis, the sending clock synchronizes with the master clock by adjusting its time so that the time that the master clock read when the signal arrived corresponds to the event on the sending clock exactly midway between the moment the signal was sent and the moment the return signal arrived. All of these notions are relativistically well-defined, so this method of coordinating clocks can be carried out. Every event in space-time is now assigned a time (viz., the reading on that member of the family of clocks that passes through the event when it passes through the event).
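The adjustment made by each sending clock amounts to a single subtraction. The sketch below assumes the three pieces of data are available as numbers in a common unit; the function name and the sample readings are hypothetical.

```python
def sync_correction(t_sent, t_returned, master_reading):
    """Amount to add to every reading of the sending clock so that the event
    midway between emission and return (by the sending clock's own count)
    is assigned the reading the master clock showed when the signal arrived."""
    midpoint = (t_sent + t_returned) / 2
    return master_reading - midpoint

# Hypothetical readings: signal sent at 2.0, return signal received at 6.0,
# master clock read 12.0 when the signal reached it.
correction = sync_correction(t_sent=2.0, t_returned=6.0, master_reading=12.0)
print(correction)  # 8.0
```

After the correction, the sending clock's midpoint event (originally read as 4.0) is labeled 12.0, matching the master clock's reading at the reflection.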
We can now identify simultaneity according to this family of clocks as sets of events that are all assigned the same time by this family of clocks. Such a set is called a simultaneity slice through the space-time. The figure below shows one such simultaneity slice. Because all of the light signals that reach the master clock at noon lie on the past light-cone of the master clock showing noon, and because all of the return signals lie on the future light-cone of that event, it is easy to calculate the points at which all of the coordinated clocks will register noon. It is the flat plane in the middle.
The simultaneity slice is a function of which family of co-moving clocks we choose. Choosing another family will give a different notion of simultaneity:
Each family of co-moving clocks determines its own notion of simultaneity, and these various notions render different judgments concerning which pairs of events happen at the same time. All the families will agree about the time order of time-like or light-like separated events, but for any pair of space-like separated events, some families will say that they happened at the same time, others that one happened first, and yet others that the other happened first. Each family introduces its own global time function. None of these functions is superior to the other, and none is needed at all to explicate the basic spatio-temporal structure.
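The disagreement about time order can be illustrated numerically. The sketch below assumes the standard Lorentz transformation for the time coordinate, which this section does not write out, and uses units in which c = 1; the function name and the sample events are hypothetical.

```python
import math

def boosted_time(t, x, v, c=1.0):
    """Time coordinate of event (t, x) for a clock family moving at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c ** 2)

# Two space-like separated events, simultaneous for the first family.
event_a = (0.0, 0.0)
event_b = (0.0, 1.0)

for v in (-0.5, 0.0, 0.5):
    dt = boosted_time(*event_b, v) - boosted_time(*event_a, v)
    print(f"v = {v:+.1f}: t_b - t_a = {dt:+.3f}")
```

The sign of the difference changes with the choice of family: one family judges that a happened first, another that b did, and the original family that they were simultaneous. Repeating the calculation for a time-like separated pair, such as (0, 0) and (1, 0.5), gives a positive difference for every admissible v.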
What of spatial distance? Once a family of clocks has been synchronized, there is a simple way to assign a spatial distance between any pair of clocks. Send a light ray from one clock to the other. We can now understand the time of travel for the light ray as the difference between the time showing on the emitting clock at the emission event and the time showing on the receiving clock at the reception event. So we now have a definition of how long the light ray took to get from one clock to the other (again, this is not the time that a clock traveling along with the light ray would show elapsing). If we now define the speed of light to be a given constant, c, then we can say that the distance between the clocks is just c times the elapsed time of transmission. This will give us a structure of spatial distances between the clocks as defined by that particular family of clocks. Those spatial distances will, in special relativity, constitute a Euclidean space. Different families of clocks will disagree about the precise spatial distance between events, and about the spatial size of material objects, but each family will construct for itself a Euclidean spatial structure. Finally, if we allow such a family of clocks to introduce Cartesian coordinates on its Euclidean space, then the family will assign each event four coordinate numbers: the three spatial coordinates and the global time function. These are exactly the Lorentz coordinate frames that we began with to express the relativistic metric, so we have come full circle.
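As a minimal sketch of the distance definition, assuming the two clocks have already been synchronized by the method above and that times are measured in seconds (the function name and sample readings are hypothetical):

```python
import math

C = 299_792_458.0  # speed of light in metres per second, fixed by definition

def light_travel_distance(t_emitted, t_received):
    """Distance between two synchronized clocks, from a one-way light signal.

    t_emitted is read on the emitting clock, t_received on the receiving clock.
    """
    return C * (t_received - t_emitted)

# A signal that takes one microsecond, by the family's time function.
d = light_travel_distance(0.0, 1.0e-6)
print(math.isclose(d, 299.792458))  # True
```

The result is a distance only relative to the chosen family, since the elapsed time is a difference of readings on two different clocks and so depends on how that family was synchronized.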
The interconnection between the global time defined by a family of clocks and the spatial structure among events defined by that family resolves many of the intuitive puzzles in special relativity. We have seen that, according to clocks at rest on the Earth, a high-energy muon has a much longer lifetime than a muon at rest. That explains, from the point of view of the Earth frame, how the muon manages to make the trip to the surface. But of course, from the point of view of the muon, and clocks co-moving with it, the muon lifetime is the normal 10^-6 seconds. From their point of view, the Earth is approaching them at high velocity. In that frame of reference, the muon is able to get through the whole atmosphere not because of any slowing down of their clocks, but because of the spatial contraction of the atmosphere. In the muon's frame of reference, the distance from the upper atmosphere to the Earth is much less than we on Earth take it to be.
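The two descriptions of the muon's trip can be checked against each other numerically. The figures in this sketch are assumptions chosen for illustration: a muon speed of 0.999c, an atmosphere 15 km thick, and a mean lifetime at rest of about 2.2 microseconds (the order of magnitude the text gives as 10^-6 seconds).

```python
import math

C = 299_792_458.0    # speed of light in m/s
TAU = 2.2e-6         # muon mean lifetime at rest, in s
SPEED = 0.999 * C    # assumed muon speed relative to the Earth
HEIGHT = 15_000.0    # assumed thickness of the atmosphere in the Earth frame, in m

gamma = 1.0 / math.sqrt(1.0 - (SPEED / C) ** 2)

# Earth frame: the muon's lifetime is dilated, the atmosphere has its full height.
earth_lifetime = gamma * TAU
earth_crossing_time = HEIGHT / SPEED

# Muon frame: the lifetime is the ordinary one, the atmosphere is contracted.
muon_height = HEIGHT / gamma
muon_crossing_time = muon_height / SPEED

# Fraction of a mean lifetime used up by the trip, computed in each frame.
earth_fraction = earth_crossing_time / earth_lifetime
muon_fraction = muon_crossing_time / TAU

print(f"gamma = {gamma:.1f}")
print(f"Earth frame: trip takes {earth_fraction:.3f} mean lifetimes")
print(f"Muon frame:  trip takes {muon_fraction:.3f} mean lifetimes")
```

The two frames disagree about the elapsed time and about the distance, but they agree on the ratio of travel time to lifetime, and therefore on whether the muon reaches the ground.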
The Lorentz contraction and time dilation effects of relativity then arise as disagreements that occur between the Lorentz frames about the amount of time that elapses between events and the spatial distance between events. Clocks in any frame will be seen to run slow according to the time function associated with any other frame. A meter stick at rest in one frame will be judged to be less than a meter long according to a frame in which the stick is moving. These are symmetric effects: From the point of view of any Lorentz frame, clocks at rest in any other frame run slow. We need to sharply distinguish these effects from the twins paradox. There, the difference in elapsed time for each twin is a consequence of the fundamental spatiotemporal structure, and has nothing to do with frames or families of clocks. The time dilation between frames results only from different ways of defining coordinates. In the latter case, there is no fact about which set of clocks is really running slower, but in the former case there is an objective fact about which twin is biologically younger when they are reunited.
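The objective character of the twins' age difference can be illustrated by adding up the relativistic interval along each twin's trajectory. The sketch below assumes units in which c = 1 and takes the interval along a straight time-like segment to be the square root of (Δt² − Δx²), a sign convention adopted here for illustration; the function name and the coordinates of the events are hypothetical.

```python
import math

def proper_time(path, c=1.0):
    """Total elapsed clock time along a piecewise-straight path of (t, x) events."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(path, path[1:]):
        dt, dx = t1 - t0, x1 - x0
        total += math.sqrt(dt ** 2 - (dx / c) ** 2)
    return total

# Coordinates are given in one Lorentz frame, but the totals do not depend on that choice.
stay_home = [(0.0, 0.0), (10.0, 0.0)]
traveler = [(0.0, 0.0), (5.0, 4.0), (10.0, 0.0)]

print(proper_time(stay_home))  # 10.0
print(proper_time(traveler))   # 6.0
```

Both paths begin and end at the same two events, yet the bent path is shorter in elapsed time. Because the interval along each segment is the same in every Lorentz frame, no choice of clock family can alter the result.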
Special relativity is a theory that postulates a certain intrinsic spatiotemporal structure, and then formulates the laws of physics in terms of that structure. General relativity is the relativistic theory of gravity. It is also fundamentally a theory about spatiotemporal structure, and allows for different structures than special relativity. So the first question that arises when approaching general relativity is why gravity should particularly be connected to spatiotemporal structure. The special relativistic theory of electromagnetism, for example, simply accepts the Minkowski space-time and employs it in framing the electromagnetic laws. But gravity, in contrast, led to the rejection of special relativity in favor of a new theory. What is so special about gravity?
One sometimes hears that there needed to be a relativistic theory of gravity because Newton's gravitational theory postulates that gravity acts instantaneously between distant masses, but in relativity there is no available notion of instantaneous action (because there is no physical notion of simultaneity). But this observation does nothing to suggest that the theory of gravity should require any change from the special relativistic space-time. Classical electrostatics postulated that the Coulomb force between distant charged particles acts instantaneously, but electromagnetic phenomena do not require changes to special relativity. Rather, relativistic electrodynamics simply rejects the claim that electric and magnetic forces act instantaneously. Electromagnetic influences are propagated along the light cones, at the speed of light, by electromagnetic waves. Similarly, one might think that the obvious way to deal with gravitation is simply to deny that it acts instantaneously. Let the gravitational effects also propagate along the light cones, and the special relativistic structure can be used to formulate the laws.
Adding such a delay in gravitational influence would, of course, modify the predictions of Newtonian gravity. One might even plausibly argue that Newton himself would have expected such a correction to his instantaneous gravity. For Newton thought that gravitational forces were mediated by particles exchanged between the gravitating bodies, and he would have expected the particles to take some time in traveling between the bodies. Of course, the fundamental cause of the gravitational force was a topic on which Newton refused to fingere any hypothesis, so we must be a bit speculative here. But it is worthwhile to note that if we modify classical Newtonian gravitational theory to allow gravitational influence to propagate along the light cones, we can exactly derive some famous relativistic effects, such as the anomalous advance in the perihelion of Mercury.
In order to understand why gravity is plausibly taken to be deeply connected to space-time structure, we need to look elsewhere. Consider again the family of co-moving inertial clocks we made use of in our discussion of special relativity. Once set in motion, the family of clocks will move together, never approaching or receding from each other. That is because: a) the clocks are all traveling inertially, not subject to any force; b) according to the space-time version of Newton's first law, the trajectories of bodies subject to no forces will be straight lines in space-time; and c) the straight-line trajectories of the co-moving clocks form a family of parallel straight lines. Note that in giving this argument, we never had to mention the mass of any of the clocks. Because they are moving inertially, the trajectories of the clocks are determined by the intrinsic space-time structure, without the mass playing any role. It would not matter if some of the clocks were heavy and others light; they would still move together parallel to one another.
In Newtonian physics, the mass of a body only comes into consideration when a body is subject to a force and thereby deflected off its inertial trajectory. The inertial mass of a body is nothing but a measure of the body's resistance to being deflected by a force from its inertial trajectory: The more massive a body is, the harder it is to make its trajectory bend in space-time. Newton's second law, which we now render F = mA, tells us that the same force will only produce half the acceleration in a body that is twice as massive. So in the presence of forces, the trajectories of bodies will depend on their masses, whereas in the absence of forces the more and less massive bodies will move on parallel trajectories. Turning this observation around, we should find it very suggestive if there is a situation in which the trajectory of a body does not depend at all on its mass. It is natural to suspect that in such a situation, the mass of the body is playing no role because the body is not being subject to any force; it is moving inertially.
Recall Galileo at the top of the Leaning Tower of Pisa dropping a lighter and heavier object and seeing them hit the ground together. Here is a common situation in which the mass of an object does not affect its trajectory: The heavy and light follow the same space-time path. According to Newtonian gravitational theory, this is a rather fortuitous result. In that theory, both the heavy and the light object are subject to a force, the force of gravity, and so each is being deflected off its inertial trajectory. But, luckily, the gravitational force on each object is exactly proportional to its inertial mass. So the more massive object, which needs a greater force to be accelerated, is subject to a greater force than the less massive object. Indeed, the gravitational force on the more massive object is exactly as much larger as it needs to be to produce precisely the same acceleration as the lesser force of gravity produces on the less massive object. That, according to Newton, is why they fall together; they are both accelerated, but at exactly the same rate.
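The Newtonian account of the coincidence can be put in one line. The symbols here are introduced only for illustration and do not appear elsewhere in this entry: m_i is the inertial mass that enters Newton's second law, m_g is the gravitational mass that fixes the strength of the gravitational force on the body, and g is the local gravitational field strength.

```latex
m_i \, a \;=\; F_{\text{grav}} \;=\; m_g \, g
\quad\Longrightarrow\quad
a \;=\; \frac{m_g}{m_i}\, g \;=\; g
\qquad \text{when } m_g = m_i .
```

The mass cancels only because the two quantities are exactly proportional. Newtonian theory assumes this proportionality but offers no reason for it.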
If we follow the hint above, though, we will be led to suspect a different account. Perhaps the two objects move together not because they are equally deflected off their inertial, straight-line trajectories, but rather because they are both following their inertial trajectories. Because the inertial trajectories are straight lines in space-time, this suggests a deep connection between gravity and fundamental spatiotemporal structure.
In this way we arrive at the general theory of relativity. According to general relativity, objects that are falling in a gravitational field or under the influence of a gravitational force are not being affected by any force at all. Gravity does not deflect objects from their inertial paths; rather, it influences the very structure of space-time itself. The balls falling from the Leaning Tower of Pisa, or the planets orbiting the sun, are following straight trajectories through space-time.
To realize this theory, we must reject Minkowski space-time. Consider, for example, two satellites orbiting the Earth in opposite senses. The space-time diagram of the situation looks like this:
As the satellites orbit, their paths cross and recross in space-time. But in Minkowski space-time, as in Euclidean space, two straight lines can intersect only once at most. So the space-times of general relativity must have a different spatiotemporal structure than the space-time of special relativity.
An analogy with pure spatial geometry helps here. Euclidean geometry is just one of an infinitude of spatial geometries. Lines on the surface of a sphere, for example, do not satisfy Euclid's postulates. But even spherical geometry is highly regular compared to most geometries. Consider, for example, the surface of North America. In regions of the Great Plains, the geometry is nearly Euclidean (and even more nearly spherical), whereas in the Rocky Mountains the geometry of the surface varies wildly from place to place. We need new mathematical machinery to deal with this sort of situation.
The general mathematics needed is called differential geometry. Differential geometry is suited to deal with spaces whose geometrical structure varies from place to place. In some regions, a space may be locally Euclidean, in others non-Euclidean, so we have to be able to describe the geometry region by region.
Euclidean spaces have a particularly uniform geometrical structure that allows them to admit of very convenient coordinate systems. As we have seen, a Euclidean space admits of Cartesian coordinates, in which the distance between points is a simple mathematical function of the coordinates of the points. Non-Euclidean spaces do not admit of such convenient systems. For example, points on a sphere can be coordinatized by latitude and longitude, but distances between the points on a sphere are not a simple function of their coordinate differences. If you are near the North Pole, you can change your longitude by several degrees just by taking a few steps; near the equator the same change of longitude would require traveling hundreds of miles. And even spherical coordinates are relatively simple and uniform.
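The dependence of distance on position, and not only on coordinate difference, can be seen in a short calculation. This sketch treats the Earth as a perfect sphere of radius 6,371 km and measures distance along a circle of constant latitude, which for small changes of longitude is close to the shortest distance; the function name and the sample latitudes are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; the Earth is idealized as a sphere

def east_west_distance_km(latitude_deg, delta_longitude_deg):
    """Distance along a circle of latitude for a given change of longitude."""
    circle_radius = EARTH_RADIUS_KM * math.cos(math.radians(latitude_deg))
    return circle_radius * math.radians(delta_longitude_deg)

# The same three-degree change of longitude at three different latitudes.
for latitude in (0.0, 45.0, 89.9999):
    distance = east_west_distance_km(latitude, 3.0)
    print(f"latitude {latitude:8.4f}: {distance:.4f} km")
```

At the equator the three degrees amount to several hundred kilometers; about eleven meters from the pole they amount to less than a meter.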
To get a sense of a completely generic coordinate system, imagine walking down a road where each successive house has an address one greater than that of the house before. You want to get to house number 200 and you are currently at house 100. How far must you walk?
There is no way to tell. If you go through a densely populated area, such as a small town, you will get to your destination quickly. If it is a sparsely built region, you may have to walk a long way. To know how far you have to go, you would need a complete listing of the distances between successive houses. If you have such a list, you can calculate the distance between any two houses, and so can reconstruct the geometrical structure of the region where the houses are built. In an analogous way, the general theory of spaces allows for the use of any arbitrary coordinate system. Accompanying the system is a metric that specifies the distances between nearby points. We do not have any general rule for calculating distances between distant points as a function of their coordinates, but we do not need one. The distance between faraway points is just the length of the straight path that connects them, and we can calculate the length of that path by knowing the distance between nearby points and adding up all the distances along the path. Thus we have the mathematical tools to deal with generic spaces of variable curvature that admit of nothing like Cartesian coordinates.
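The house-number example can be made concrete. In the sketch below, the table of gaps plays the role of the metric and the addresses play the role of coordinates; the data and the function name are hypothetical.

```python
# Hypothetical data: gap in metres between each house and the next one along the road.
gaps = {100: 20.0, 101: 25.0, 102: 400.0, 103: 30.0}

def walking_distance(start, end, gaps):
    """Add up the gaps between successive addresses from start to end."""
    first, last = sorted((start, end))
    return sum(gaps[address] for address in range(first, last))

print(walking_distance(100, 104, gaps))  # 475.0
print(walking_distance(100, 102, gaps))  # 45.0
```

The difference between two addresses says nothing about distance until the table is consulted: houses 100 and 102 are 45 meters apart, while houses 102 and 103 alone are separated by 400 meters.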
It is sometimes said that the general theory of relativity requires us to replace Euclidean space with a non-Euclidean space, but that is not a very useful, or accurate, explanation of the situation. As we have seen, even in special relativity the notion of spatial geometry is rather derivative and non-fundamental. The fundamental notion is the relativistic interval, which is a spatiotemporal object. It is only relative to a family of co-moving objects, such as clocks, that we can even define a spatial geometry. It turns out that, in special relativity, each such family will ascribe Euclidean geometry to its space, but that is somewhat fortuitous; there is no logical guarantee that the various families will agree on their findings. After all, in special relativity the various families will disagree about the exact spatial distance (and temporal gap) between a given pair of events. In general relativity, there will, in general, not exist families of co-moving inertial observers that maintain the same spatiotemporal relations to one another, and so there is no unproblematic way to define a spatial geometry at all.
In any case, it is simply incorrect to say that objects moving in a gravitational field trace out straight paths in a non-Euclidean spatial geometry. The orbits of the planets, for example, are nearly elliptical in any reasonably defined space for the solar system, and the ellipses are not (spatially) straight lines.
The proper account of general relativity rather employs an analogy. As the variably curved non-Euclidean spaces are to Euclidean space, so the variably curved space-times of general relativity are to Minkowski space-time. The orbits of the satellites depicted above are not straight paths in any spatial geometry, but they are straight paths in space-time. The effect of the Earth is not to produce a force that deflects the satellites off their inertial paths; it is to alter the space-time geometry so that it contains inertial paths that cross and recross.
On the Newtonian picture of gravity, when we sit on a chair we are not accelerated because we are acted on by counterbalancing forces: The gravitational force pulling us down and the force of the chair pushing us up. According to the general relativistic account, the force of the chair pushing up still exists, but it is unbalanced by any gravitational force. It follows that according to general relativity, as we sit we are constantly accelerated (i.e., constantly being deflected off of our inertial, straight-line trajectories through space-time). The inertial trajectory is that of an object unsupported by anything like the chair (i.e., an object in free fall).
The curvature of general relativistic space-time is partially a function of the distribution of matter and energy; that is why space-time near a massive object like the Earth is curved in such a way as to produce a gravitational field. This connection between the matter and energy distribution and the spatiotemporal geometry is provided by Einstein's general relativistic field equations. But although the distribution of matter and energy influences the space-time geometry, it does not completely determine it.
The situation is similar to the relationship between the electromagnetic field and the electric charge distribution in classical physics. The presence of electric charges contributes to the electromagnetic field, but does not, by itself, determine it. For example, even in a space devoid of electric charges, there can be a nonzero electromagnetic field: electromagnetic waves (i.e., light) can propagate through the vacuum. Similarly, the general theory predicts the existence of gravitational waves—disturbances of the spatiotemporal geometry that can exist even in the absence of any matter or energy and that propagate at the speed of light. There are, for example, many vacuum solutions of the Einstein field equations. One solution is Minkowski space-time, but other solutions contain gravitational waves.
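For reference, the field equations are usually written as follows. The notation is standard but is not introduced elsewhere in this entry, and the form shown assumes no cosmological constant: g is the space-time metric, R with two indices is the Ricci curvature tensor, R without indices is its trace, T is the stress-energy tensor describing the distribution of matter and energy, and G on the right-hand side is Newton's gravitational constant.

```latex
G_{\mu\nu} \;\equiv\; R_{\mu\nu} - \tfrac{1}{2} R \, g_{\mu\nu}
\;=\; \frac{8\pi G}{c^{4}} \, T_{\mu\nu},
\qquad
T_{\mu\nu} = 0 \;\Longrightarrow\; R_{\mu\nu} = 0 .
```

The vacuum condition constrains only part of the curvature. Minkowski space-time satisfies it because its curvature vanishes entirely, while a gravitational wave satisfies it with nonzero curvature, which is how the geometry can be left undetermined by the matter distribution.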
Because general relativity concerns spatiotemporal structure, and because the trajectory of light rays is determined by the light-cone structure, general relativity must predict the gravitational bending of light. It is not clear whether Newtonian physics would predict any gravitational effect on light because that would depend on whether light feels any gravitational force, but light certainly does propagate through space-time. The effect of gravity on light was dramatically confirmed in Arthur S. Eddington's 1919 eclipse expedition, but is even more strikingly illustrated in the phenomenon of gravitational lensing: A galaxy positioned between the Earth and a more distant light source can act as a lens, focusing the light of the distant source on the Earth. Two astronauts traveling inertially could experience a similar effect; they could take different straight paths that both originate at their home planet and both end on Earth, going different ways around an intervening galaxy. Because the relativistic interval along those paths could differ, such astronauts could illustrate the twins paradox without any acceleration; twins coming from the distant planet could have different biological ages when they reunite on Earth, even though neither suffered any acceleration.
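The size of the effect tested in the eclipse observations can be estimated from the standard weak-field result for the deflection of a light ray passing a mass M at closest distance b, namely 4GM/(c²b). That formula is not derived in this entry and is assumed here; the constants are rounded values, and the function name is hypothetical.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0    # speed of light, m/s
M_SUN = 1.989e30     # mass of the sun, kg
R_SUN = 6.957e8      # radius of the sun, m

def deflection_arcsec(mass, closest_approach):
    """Weak-field deflection angle 4GM/(c^2 b), converted to arcseconds."""
    angle_rad = 4.0 * G * mass / (C ** 2 * closest_approach)
    return math.degrees(angle_rad) * 3600.0

# A ray that just grazes the edge of the sun.
print(round(deflection_arcsec(M_SUN, R_SUN), 2))  # 1.75
```

The angle is under two seconds of arc, which is why the measurement required a total eclipse: only then can stars close to the edge of the sun be photographed.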
The spatiotemporal geometry of general relativity accounts for familiar gravitational phenomena, but the theory also has dramatic consequences at the cosmological scale and in extreme physical conditions. When a massive star burns through its nuclear fuel and collapses, for example, the increasing density of matter causes ever greater curvature in space-time. If the star is sufficiently massive, the light-cone structure deviates enough from Minkowski space-time to form a trapped surface: a region from which light cannot escape. The event horizon around a black hole is such a trapped surface; an object falling through the horizon can never send light, or any other signal, back to the exterior region. Once the infalling matter of the star reaches this point, it is destined to become ever more compressed without limit, and the curvature of the space-time will grow to infinity. If the equations continue to hold, this results in a space-time singularity; the spatiotemporal structure cannot be continued beyond a certain limit and space-time itself comes to an end. Because the spatiotemporal structure itself has become singular, it no longer makes any conceptual sense to ask what happens after the singularity; no meaning could be attached to the term after in the absence of spatiotemporal structure.
In the opposite temporal direction, the general theory also contains models in which the universe as a whole arises out of such a singularity, the singularity we call the big bang. Indeed, if general relativity is not modified, the observed motions of galaxies require that the universe began at a singularity, and that space-time itself has been expanding ever since. There is equally no sense to be made of the question what happened before the big bang because the spatiotemporal structure needed to define temporal priority would not extend beyond the initial singularity.
It is, of course, possible that the equations of the theory will be modified in some way so as to avoid the infinities and singularities, but that takes us from the analysis of general relativity into speculations about the replacement of general relativity. The mathematical structure of general relativity also admits of models of the theory with very peculiar spatiotemporal structures. Some models, for example, admit closed time-like curves, that is, time-like trajectories that loop back through space-time and meet up with themselves. In such a model, a person could in principle continue going always locally forward in time, but end up (as an adult) back at the events of their childhood. There seems to be no way to physically test this possibility (that is, there is no physical mechanism to produce closed time-like curves through laboratory operations), so it is unclear whether the existence of these mathematical models proves the physical possibility of such time travel or rather the physical impossibility of space-times that correspond to these mathematical solutions. In any case, general relativity provides a means for considering spatiotemporal structures unlike any that occur in classical physics.
The special and general theories of relativity provide a rich source of novel concepts of great interest to metaphysics. The topics that could be informed by these theories are too numerous even to list, but the most obvious metaphysical implications of the theories are worthy of remark. The nature of space and time occupies a central place in Immanuel Kant's Critique of Pure Reason, where supposed a priori knowledge of spatial and temporal structure provided grounds for the conclusion that space and time have no existence outside the faculty of intuition. After all, how could one know anything a priori about space and time if they exist outside the mind?
The theories of relativity simply refute the claim that there is any a priori knowledge of spatiotemporal structure. Even if relativity ultimately proves to be incorrect, everything in our everyday experience of the world can be accommodated in the relativistic spatiotemporal account. For all we know at present, we could be living in a relativistic universe, in which there is no Euclidean space and in which even time need not have a universal linear order. The nature of space and time is a matter of empirical inquiry, not a priori proof.
The special and general theories are also relevant to the question of the nature of space and time: Are they entities in their own right (as Newton supposed) or just relations among material bodies (as G.W. Leibniz insisted)? Taken at face value, the theories posit an independent existence to the four-dimensional space-time manifold. Even in the absence of material bodies, there is a spatiotemporal structure among the points of space-time. As the twins paradox shows, the observable behavior of material objects is determined by that structure. And even more dramatically, in general relativity the space-time manifold takes on a life of its own; gravitational waves can exist even in the absence of any material objects, and the presence of material objects influences the structure of space-time.
Attempts have been made to reformulate general relativity in a more relationist manner, in terms only of relations among material objects without commitment to any spatiotemporal structure of vacuum regions. These attempts have not succeeded. One can, of course, simply declare that in the general theory, space-time itself counts as a material entity, but then the argument seems to be only over labels rather than ontology.
Like all empirical theories, relativity is supported but not proven by observation. The spatiotemporal structure cannot be directly observed, but theories of matter couched in terms of the relativistic structure yield testable predictions that can be checked. The general theory, for example, has been checked by flying an atomic clock around the world and comparing its reading with an initially synchronized clock that remained on Earth. Because the trajectories of the clocks have different relativistic intervals, one can predict that the traveling clock will show a different elapsed time from the clock that remained behind—which it does. There may be other ways to explain the effect, but it is a natural consequence of the relativistic account of space-time structure.
Challenges to the theory of relativity are more likely to come from considerations of the compatibility of the theory with other fundamental physical theories than from direct empirical problems. It is, for example, a still unsolved problem how to reconcile quantum physics with the pure relativistic space-time structure, and another unsolved problem how to produce a quantum theory of gravity. Most particularly, the observable violations of John Bell's inequality for events at space-like separation are difficult to account for in any theory that has no preferred simultaneity slices in its space-time. So the metaphysician ought not to take the account of space-time provided by relativity as definitive; progress in physics may well demand radical revision of the account of spatiotemporal structure. Still, relativity illustrates how empirical inquiry can lead to the revision of the most seemingly fundamental concepts, even those that were once taken as preconditions for any experience at all.
See also Bell, John, and Bell's Theorem; Eddington, Arthur Stanley; Einstein, Albert; Energy; Galileo Galilei; Geometry; Knowledge, A Priori; Laws, Scientific; Leibniz, Gottfried Wilhelm; Matter; Motion, A Historical Survey; Newton, Isaac; Philosophy of Physics; Quantum Mechanics; Space; Time.
Bell, John S. "How to Teach Special Relativity." In The Speakable and Unspeakable in Quantum Mechanics. Cambridge, U.K.: Cambridge University Press, 1987.
Einstein, Albert. "On The Electrodynamics of Moving Bodies." In The Principle of Relativity. New York: Dover, 1952.
Einstein, Albert. "The Foundations of General Relativity." Annalen der Physik 49 (1916).
Geroch, Robert P. General Relativity from A to B. Chicago: University of Chicago Press, 1978.
Maudlin, Tim. Quantum Non-Locality and Relativity: Metaphysical Intimations of Modern Physics. Oxford: Basil Blackwell, 1994.
Misner, Charles, Kip S. Thorne, and John Archibald Wheeler. Gravitation. San Francisco: W. H. Freeman, 1973.
Taylor, E. F., and J. A. Wheeler. Spacetime Physics: Introduction to Special Relativity. New York: W. H. Freeman, 1992.
Tim Maudlin (2005)